POLYNUCLEOTIDES ENCODING ANTIGENIC HIV TYPE C POLYPEPTIDES,
POLYPEPTIDES AND USES THEREOF

Technical Field 

Polynucleotides encoding antigenic Type C HIV Gag-containing polypeptides
and/or Env-containing polypeptides are described, as are uses of these polynucleotides
and polypeptide products in immunogenic compositions.

Background of the Invention 
Acquired immune deficiency syndrome (AIDS) is recognized as one of the
greatest health threats facing modern medicine. There is, as yet, no cure for this
disease.

In 1983-1984, three groups independently identified the suspected etiological agent
of AIDS. See, e.g., Barre-Sinoussi et al. (1983) Science 220:868-871; Montagnier et
al., in Human T-Cell Leukemia Viruses (Gallo, Essex & Gross, eds., 1984); Vilmer et
al. (1984) The Lancet 1:753; Popovic et al. (1984) Science 224:497-500; Levy et al.
(1984) Science 225:840-842. These isolates were variously called
lymphadenopathy-associated virus (LAV), human T-cell lymphotropic virus type III
(HTLV-III), or AIDS-associated retrovirus (ARV). All of these isolates are strains of
the same virus, and were later collectively named Human Immunodeficiency Virus
(HIV). With the isolation of a related AIDS-causing virus, the strains originally
called HIV are now termed HIV-1 and the related virus is called HIV-2. See, e.g.,
Guyader et al. (1987) Nature 326:662-669; Brun-Vezinet et al. (1986) Science
233:343-346; Clavel et al. (1986) Nature 324:691-695.

A great deal of information has been gathered about HIV; however,
to date an effective vaccine has not been identified. Several targets for vaccine
development have been examined, including the env and Gag gene products encoded
by HIV. Gag gene products include, but are not limited to, Gag-polymerase and
Gag-protease. Env gene products include, but are not limited to, monomeric gp120
polypeptides, oligomeric gp140 polypeptides and gp160 polypeptides.

WO 00/39304    PCT/US99/31273

Haas, et al. (Current Biology 6(3):315-324, 1996) suggested that selective
codon usage by HIV-1 appeared to account for a substantial fraction of the
inefficiency of viral protein synthesis. Andre, et al. (J. Virol. 72(2):1497-1503, 1998)
described an increased immune response elicited by DNA vaccination employing a
synthetic gp120 sequence with optimized codon usage. Schneider, et al. (J. Virol.
71(7):4892-4903, 1997) discuss inactivation of inhibitory (or instability) elements
(INS) located within the Gag and Gag-protease coding sequences.

The Gag proteins of HIV-1 are necessary for the assembly of virus-like
particles. HIV-1 Gag proteins are involved in many stages of the life cycle of the
virus including assembly, virion maturation after particle release, and early post-entry
steps in virus replication. The roles of HIV-1 Gag proteins are numerous and
complex (Freed, E.O., Virology 251:1-15, 1998).

Wolf, et al. (PCT International Application, WO 96/30523, published 3
October 1996; European Patent Application, Publication No. 0 449 116 A1, published
2 October 1991) have described the use of altered pr55 Gag of HIV-1 to act as a
non-infectious retroviral-like particulate carrier, in particular, for the presentation of
immunologically important epitopes. Wang, et al. (Virology 200:524-534, 1994)
describe a system to study assembly of HIV Gag-β-galactosidase fusion proteins into
virions. They describe the construction of sequences encoding HIV
Gag-β-galactosidase fusion proteins, the expression of such sequences in the presence
of HIV Gag proteins, and assembly of these proteins into virus particles.

Shiver, et al. (PCT International Application, WO 98/34640, published 13
August 1998) described altering HIV-1 (CAM1) Gag coding sequences to produce
synthetic DNA molecules encoding HIV Gag and modifications of HIV Gag. The
codons of the synthetic molecules were codons preferred by a projected host cell.

Recently, use of HIV Env polypeptides in immunogenic compositions has
been described (see U.S. Patent No. 5,846,546 to Hurwitz et al., issued December 8,
1998, describing immunogenic compositions comprising a mixture of at least four
different recombinant viruses that each express a different HIV env variant; and U.S.
Patent No. 5,840,313 to Vahlne et al., issued November 24, 1998, describing peptides
which correspond to epitopes of the HIV-1 gp120 protein). In addition, U.S. Patent
No. 5,876,731 to Sia et al., issued March 2, 1999, describes candidate vaccines against
HIV comprising an amino acid sequence of a T-cell epitope of Gag linked directly to
an amino acid sequence of a B-cell epitope of the V3 loop protein of an HIV-1 isolate
containing the sequence GPGR. There remains a need for antigenic HIV
polypeptides, particularly those of Type C isolates.



Summary of the Invention 

The present invention relates to improved expression of HIV Type C
Gag-containing polypeptides and production of virus-like particles, as well as
Env-containing polypeptides.

One aspect of the present invention relates to expression cassettes and
polynucleotides contained therein. In one embodiment, an expression cassette
comprises a polynucleotide sequence encoding a polypeptide including an HIV
Gag-containing polypeptide, wherein the polynucleotide sequence encoding the Gag
polypeptide comprises a sequence having at least about 85%, preferably about 90%,
more preferably about 95%, and most preferably about 98% sequence identity to the
sequences taught in the present specification. The polynucleotide sequences encoding
Gag-containing polypeptides include, but are not limited to, the following
polynucleotides: nucleotides 844-903 of Figure 1 (a Gag major homology region)
(SEQ ID NO:1); nucleotides 841-900 of Figure 2 (a Gag major homology region)
(SEQ ID NO:2); the sequence presented as Figure 1 (SEQ ID NO:3); and the
sequence presented as Figure 2 (SEQ ID NO:4). The polynucleotides encoding the
Gag-containing polypeptides of the present invention may also include sequences
encoding additional polypeptides. Such additional polynucleotides encoding
polypeptides may include, for example, coding sequences for other HIV proteins,
such as polynucleotide sequences encoding an HIV protease polypeptide, and
polynucleotide sequences encoding an HIV polymerase polypeptide. In one
embodiment, the sequence encoding the HIV polymerase polypeptide can be modified
by deletions of coding regions corresponding to reverse transcriptase and integrase.
Such deletions in the polymerase polypeptide can also be made such that the
polynucleotide sequence preserves T-helper cell and CTL epitopes. Other antigens of
interest may be inserted into the polymerase as well.

In another embodiment, an expression cassette comprises a polynucleotide
sequence encoding a polypeptide including an HIV Env-containing polypeptide,
wherein the polynucleotide sequence encoding the Env polypeptide comprises a
sequence having at least about 85%, preferably about 90%, more preferably about
95%, and most preferably about 98% sequence identity to the sequences taught in the
present specification. The polynucleotide sequences encoding Env-containing
polypeptides include, but are not limited to, the following polynucleotides:
nucleotides 1213-1353 of Figure 3 (SEQ ID NO:5) (an Env common region);
nucleotides 82-1512 of Figure 3 (SEQ ID NO:6) (a gp120 polypeptide); nucleotides
82-2025 of Figure 3 (SEQ ID NO:7) (a gp140 polypeptide); nucleotides 82-2547 of
Figure 3 (SEQ ID NO:8) (a gp160 polypeptide); nucleotides 1-2547 of Figure 3 (SEQ
ID NO:9) (a gp160 polypeptide with signal sequence); nucleotides 1513-2547 of
Figure 3 (SEQ ID NO:10) (a gp41 polypeptide); nucleotides 1210-1353 of Figure 4
(SEQ ID NO:11) (an Env common region); nucleotides 73-1509 of Figure 4 (SEQ ID
NO:12) (a gp120 polypeptide); nucleotides 73-2022 of Figure 4 (SEQ ID NO:13) (a
gp140 polypeptide); nucleotides 73-2565 of Figure 4 (SEQ ID NO:14) (a gp160
polypeptide); nucleotides 1-2565 of Figure 4 (SEQ ID NO:15) (a gp160 polypeptide
with signal sequence); and nucleotides 1510-2565 of Figure 4 (SEQ ID NO:16) (a
gp41 polypeptide).
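
The nucleotide ranges listed above are 1-based and inclusive, which is easy to get wrong when slicing a sequence in software. The following is a minimal illustrative sketch in Python, not part of the specification: the names ENV_REGIONS, extract_region and region_length are hypothetical, and only the coordinates themselves are taken from the text. It records the ranges and checks two properties that follow from them: each range spans a whole number of codons, and the gp120 and gp41 ranges together tile the gp160 range.

```python
# Coordinates (1-based, inclusive) as listed in the text for Figures 3 and 4.
# The dictionary layout and function names are illustrative assumptions.
ENV_REGIONS = {
    "Figure 3": {
        "Env common region": (1213, 1353),
        "gp120": (82, 1512),
        "gp140": (82, 2025),
        "gp160": (82, 2547),
        "gp160 with signal sequence": (1, 2547),
        "gp41": (1513, 2547),
    },
    "Figure 4": {
        "Env common region": (1210, 1353),
        "gp120": (73, 1509),
        "gp140": (73, 2022),
        "gp160": (73, 2565),
        "gp160 with signal sequence": (1, 2565),
        "gp41": (1510, 2565),
    },
}


def region_length(start, end):
    """Length in nucleotides of a 1-based, inclusive range."""
    return end - start + 1


def extract_region(sequence, start, end):
    """Return the subsequence for a 1-based, inclusive nucleotide range."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("range lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


def check_regions(regions):
    """Check codon alignment and that gp120 + gp41 tile gp160."""
    for figure, table in regions.items():
        for name, (start, end) in table.items():
            if region_length(start, end) % 3 != 0:
                raise ValueError(f"{figure} {name} is not a whole number of codons")
        gp120, gp41, gp160 = table["gp120"], table["gp41"], table["gp160"]
        if gp120[0] != gp160[0] or gp41[1] != gp160[1] or gp120[1] + 1 != gp41[0]:
            raise ValueError(f"{figure}: gp120 and gp41 do not tile gp160")
    return True
```

Given the full sequence of Figure 3 as a string, extract_region(sequence, 82, 1512) would return the gp120-encoding portion under these conventions.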

The present invention further includes recombinant expression systems for use
in selected host cells, wherein the recombinant expression systems employ the
polynucleotides and expression cassettes of the present invention. In such systems,
the polynucleotide sequences are operably linked to control elements compatible with
expression in the selected host cell. Numerous expression control elements are known
to those in the art, including, but not limited to, the following: transcription
promoters, transcription enhancer elements, transcription termination signals,
polyadenylation sequences, sequences for optimization of initiation of translation, and
translation termination sequences. Exemplary transcription promoters include, but are
not limited to, those derived from CMV, CMV+intron A, SV40, RSV, HIV-LTR,
MMLV-LTR, and metallothionein.






In another aspect the invention includes cells comprising the expression
cassettes of the present invention where the polynucleotide sequence (e.g., encoding
an Env- and/or Gag-containing polypeptide) is operably linked to control elements
compatible with expression in the selected cell. In one embodiment such cells are
mammalian cells. Exemplary mammalian cells include, but are not limited to, BHK,
VERO, HT1080, 293, RD, COS-7, and CHO cells. Other cells, cell types, tissue
types, etc., that may be useful in the practice of the present invention include, but are
not limited to, those obtained from the following: insects (e.g., Trichoplusia ni (Tn5)
and Sf9), bacteria, yeast, plants, antigen presenting cells (e.g., macrophages,
monocytes, dendritic cells, B-cells, T-cells, stem cells, and progenitor cells thereof),
primary cells, immortalized cells, and tumor-derived cells.

In a further aspect, the present invention includes compositions for generating
an immunological response, where the composition typically comprises at least one of
the expression cassettes of the present invention and may, for example, contain
combinations of expression cassettes (such as one or more expression cassettes
carrying a Gag-polypeptide-encoding polynucleotide and one or more expression
cassettes carrying an Env-polypeptide-encoding polynucleotide). Such compositions
may further contain an adjuvant or adjuvants. The compositions may also contain one
or more Gag-containing polypeptides and/or one or more Env-containing
polypeptides. The Gag-containing polypeptides and/or Env-containing polypeptides
may correspond to the polypeptides encoded by the expression cassette(s) in the
composition, or the Gag-containing polypeptides and/or Env-containing polypeptides
may be different from those encoded by the expression cassettes. An example of the
polynucleotide in the expression cassette encoding the same polypeptide as is being
provided in the composition is as follows: the polynucleotide in the expression
cassette encodes the Gag-polypeptide of Figure 1 (SEQ ID NO:3), and the
polypeptide is the polypeptide encoded by the sequence shown in Figure 1 (SEQ ID
NO:17). An example of the polynucleotide in the expression cassette encoding a
different polypeptide from that being provided in the composition is as follows: an
expression cassette having a polynucleotide encoding a Gag-polymerase polypeptide,
and the polypeptide provided in the composition may be a Gag and/or Gag-protease
polypeptide. In compositions containing both expression cassettes (or
polynucleotides of the present invention) and polypeptides, the Env and Gag
expression cassettes of the present invention can be mixed and/or matched with
Env-containing and Gag-containing polypeptides described herein.

In another aspect the present invention includes methods of immunization of a
subject. In the method any of the above described compositions are introduced into
the subject under conditions that are compatible with expression of the expression
cassette in the subject. In one embodiment, the expression cassettes (or polynucleotides
of the present invention) can be introduced using a gene delivery vector. The gene
delivery vector can, for example, be a non-viral vector or a viral vector. Exemplary viral
vectors include, but are not limited to, Sindbis-virus derived vectors, retroviral vectors,
and lentiviral vectors. Compositions useful for generating an immunological response
can also be delivered using a particulate carrier. Further, such compositions can be
coated on, for example, gold or tungsten particles and the coated particles delivered to
the subject using, for example, a gene gun. The compositions can also be formulated
as liposomes. In one embodiment of this method, the subject is a mammal and can,
for example, be a human.

In a further aspect, the invention includes methods of generating an immune
response in a subject, wherein the expression cassettes or polynucleotides of the
present invention are expressed in a suitable cell to provide for the expression of the
Env- and/or Gag-containing polypeptides encoded by the polynucleotides of the
present invention. The polypeptide(s) are then isolated (e.g., substantially purified)
and administered to the subject in an amount sufficient to elicit an immune response.

The invention further includes methods of generating an immune response in a 
subject, where cells of a subject are transfected with any of the above described 
expression cassettes or polynucleotides of the present invention, under conditions that 
permit the expression of a selected polynucleotide and production of a polypeptide of 
interest (e.g., encoded by any expression cassette of the present invention). By this 
method an immunological response to the polypeptide is elicited in the subject. 
Transfection of the cells may be performed ex vivo and the transfected cells are 
reintroduced into the subject. Alternately, or in addition, the cells may be transfected 
in vivo in the subject. The immune response may be humoral and/or cell-mediated 
(cellular). In a further embodiment, this method may also include administration of
an Env- and/or Gag-containing polypeptide before, concurrently with, and/or after
introduction of the expression cassette into the subject.

Further embodiments of the present invention include purified
polynucleotides. Exemplary polynucleotide sequences encoding Gag-containing
polypeptides include, but are not limited to, the following polynucleotides:
nucleotides 844-903 of Figure 1 (SEQ ID NO:1) (a Gag major homology region);
nucleotides 841-900 of Figure 2 (SEQ ID NO:2) (a Gag major homology region); the
sequence presented as Figure 1 (SEQ ID NO:3); and the sequence presented as Figure
2 (SEQ ID NO:4). Exemplary polynucleotide sequences encoding Env-containing
polypeptides include, but are not limited to, the following polynucleotides:
nucleotides 1213-1353 of Figure 3 (SEQ ID NO:5) (an Env common region);
nucleotides 82-1512 of Figure 3 (SEQ ID NO:6) (a gp120 polypeptide); nucleotides
82-2025 of Figure 3 (SEQ ID NO:7) (a gp140 polypeptide); nucleotides 82-2547 of
Figure 3 (SEQ ID NO:8) (a gp160 polypeptide); nucleotides 1-2547 of Figure 3 (SEQ
ID NO:9) (a gp160 polypeptide with signal sequence); nucleotides 1513-2547 of
Figure 3 (SEQ ID NO:10) (a gp41 polypeptide); nucleotides 1210-1353 of Figure 4
(SEQ ID NO:11) (an Env common region); nucleotides 73-1509 of Figure 4 (SEQ ID
NO:12) (a gp120 polypeptide); nucleotides 73-2022 of Figure 4 (SEQ ID NO:13) (a
gp140 polypeptide); nucleotides 73-2565 of Figure 4 (SEQ ID NO:14) (a gp160
polypeptide); nucleotides 1-2565 of Figure 4 (SEQ ID NO:15) (a gp160 polypeptide
with signal sequence); and nucleotides 1510-2565 of Figure 4 (SEQ ID NO:16) (a
gp41 polypeptide). The polynucleotide sequences encoding the Gag-containing and
Env-containing polypeptides of the present invention typically have at least about
85%, preferably about 90%, more preferably about 95%, and most preferably about
98% sequence identity to the sequences taught herein.
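
To make the identity thresholds concrete, the following Python sketch computes percent identity between two sequences that have already been aligned to equal length (gaps written as "-") and reports which of the stated tiers the value reaches. It is an illustration under stated assumptions only: how identity is actually to be determined (alignment algorithm, treatment of gaps) is not defined in this passage, the function names are hypothetical, and "about" in the text is treated here as a hard cutoff purely for simplicity.

```python
# Tiers taken from the text: at least about 85%, preferably about 90%,
# more preferably about 95%, most preferably about 98%.
IDENTITY_TIERS = [
    (98.0, "most preferred"),
    (95.0, "more preferred"),
    (90.0, "preferred"),
    (85.0, "minimum"),
]


def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding identical residues.

    Assumes the inputs are pre-aligned and of equal length; columns that are
    gaps in both sequences are ignored, and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_tier(pct):
    """Return the highest tier label reached, or None below 85%."""
    for threshold, label in IDENTITY_TIERS:
        if pct >= threshold:
            return label
    return None
```

For real sequences the alignment step matters as much as the counting step; this sketch deliberately leaves alignment to whatever tool the reader already uses.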

The polynucleotides of the present invention can be produced by recombinant 
techniques, synthetic techniques, or combinations thereof. 

These and other embodiments of the present invention will readily occur to
those of ordinary skill in the art in view of the disclosure herein.

Brief Description of the Figures

Figure 1 (SEQ ID NO:3) shows the nucleotide sequence of a polynucleotide
encoding a synthetic Gag polypeptide. The nucleotide sequence shown was obtained
by modifying type C strain AF110965 and includes further modifications of INS.

Figure 2 (SEQ ID NO:4) shows the nucleotide sequence of a polynucleotide
encoding a synthetic Gag polypeptide. The nucleotide sequence shown was obtained
by modifying type C strain AF110967 and includes further modifications of INS.

Figure 3 (SEQ ID NO:9) shows the nucleotide sequence of a polynucleotide
encoding a synthetic Env polypeptide. The nucleotide sequence depicts gp160
(including a signal peptide) and was obtained by modifying type C strain AF110968.
The arrows indicate the positions of various regions of the polynucleotide, including
the sequence encoding a signal peptide (nucleotides 1-81) (SEQ ID NO:18), a gp120
polypeptide (nucleotides 82-1512) (SEQ ID NO:6), a gp41 polypeptide (nucleotides
1513-2547) (SEQ ID NO:10), a gp140 polypeptide (nucleotides 82-2025) (SEQ ID
NO:7) and a gp160 polypeptide (nucleotides 82-2547) (SEQ ID NO:8). The codons
encoding the signal peptide are modified (as described herein) from the native HIV-1
signal sequence.

Figure 4 (SEQ ID NO:15) shows the nucleotide sequence of a polynucleotide
encoding a synthetic Env polypeptide. The nucleotide sequence depicts gp160
(including a signal peptide) and was obtained by modifying type C strain AF110975.
The arrows indicate the positions of various regions of the polynucleotide, including
the sequence encoding a signal peptide (nucleotides 1-72) (SEQ ID NO:19), a gp120
polypeptide (nucleotides 73-1509) (SEQ ID NO:12), a gp41 polypeptide (nucleotides
1510-2565) (SEQ ID NO:16), a gp140 polypeptide (nucleotides 73-2022) (SEQ ID
NO:13), and a gp160 polypeptide (nucleotides 73-2565) (SEQ ID NO:14). The
codons encoding the signal peptide are modified (as described herein) from the native
HIV-1 signal sequence.

Figure 5 shows the location of some remaining INS in synthetic Gag
sequences derived from AF110965. The changes made to these sequences are boxed
in the Figures. The top line depicts a codon optimized sequence of Gag polypeptides
from the indicated strains (SEQ ID NO:20). The nucleotide(s) appearing below the
line in the boxed region(s) depict changes made to remove further INS and
correspond to the sequence depicted in Figure 1 (SEQ ID NO:3).

Figure 6 shows the location of some remaining INS in synthetic Gag
sequences derived from AF110968. The changes made to these sequences are boxed
in the Figures. The top line depicts a codon optimized sequence of Gag polypeptides
from the indicated strains (SEQ ID NO:21). The nucleotide(s) appearing below the
line in the boxed region(s) depict changes made to remove further INS and
correspond to the sequence depicted in Figure 2 (SEQ ID NO:4).

Detailed Description of the Invention

The practice of the present invention will employ, unless otherwise indicated,
conventional methods of chemistry, biochemistry, molecular biology, immunology
and pharmacology, within the skill of the art. Such techniques are explained fully in
the literature. See, e.g., Remington's Pharmaceutical Sciences, 18th Edition (Easton,
Pennsylvania: Mack Publishing Company, 1990); Methods In Enzymology (S.
Colowick and N. Kaplan, eds., Academic Press, Inc.); Handbook of Experimental
Immunology, Vols. I-IV (D.M. Weir and C.C. Blackwell, eds., 1986, Blackwell
Scientific Publications); Sambrook, et al., Molecular Cloning: A Laboratory Manual
(2nd Edition, 1989); Short Protocols in Molecular Biology, 4th ed. (Ausubel et al.,
eds., 1999, John Wiley & Sons); Molecular Biology Techniques: An Intensive
Laboratory Course (Ream et al., eds., 1998, Academic Press); PCR (Introduction to
Biotechniques Series), 2nd ed. (Newton & Graham eds., 1997, Springer Verlag).

As used in this specification and the appended claims, the singular forms "a,"
"an" and "the" include plural references unless the content clearly dictates otherwise.
Thus, for example, reference to "an antigen" includes a mixture of two or more such
agents.



1. Definitions

In describing the present invention, the following terms will be employed, and
are intended to be defined as indicated below.

"Synthetic" sequences, as used herein, refers to Gag-encoding polynucleotides
whose expression has been optimized as described herein, for example, by codon
substitution and inactivation of inhibitory sequences. "Wild-type" or
"native" sequences, as used herein, refers to polypeptide encoding sequences that are
essentially as they are found in nature, e.g., Gag and/or Env encoding sequences as
found in Type C isolates, e.g., AF110965, AF110967, AF110968 or AF110975.
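
As a rough illustration of what codon substitution means in practice, the Python sketch below replaces each codon of a coding sequence with a preferred synonymous codon while leaving the encoded amino acid sequence unchanged. It is a simplified sketch, not the procedure of the specification: the preference table HYPOTHETICAL_PREFERRED is an invented placeholder rather than a measured codon-usage table, the example sequence is arbitrary, and inactivation of inhibitory (INS) elements is not modelled at all.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Placeholder preferences (amino acid -> codon); NOT a real usage table.
HYPOTHETICAL_PREFERRED = {"G": "GGC", "A": "GCC", "S": "AGC", "R": "CGG"}


def translate(coding_sequence):
    """Translate a DNA coding sequence (length divisible by 3) to protein."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(GENETIC_CODE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def substitute_codons(coding_sequence, preferred):
    """Swap each codon for the preferred synonym, if one is given."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        amino_acid = GENETIC_CODE[codon]
        replacement = preferred.get(amino_acid, codon)
        # Guard against a preference table that would change the protein.
        if GENETIC_CODE[replacement] != amino_acid:
            raise ValueError(f"{replacement} does not encode {amino_acid}")
        out.append(replacement)
    return "".join(out)
```

The key invariant is that translate(substitute_codons(seq, table)) equals translate(seq): only the nucleotide sequence changes, never the polypeptide.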

As used herein, the term "virus-like particle" or "VLP" refers to a
nonreplicating, viral shell, derived from any of several viruses discussed further
below. VLPs are generally composed of one or more viral proteins, such as, but not
limited to, those proteins referred to as capsid, coat, shell, surface and/or envelope
proteins, or particle-forming polypeptides derived from these proteins. VLPs can
form spontaneously upon recombinant expression of the protein in an appropriate
expression system. Methods for producing particular VLPs are known in the art and
discussed more fully below. The presence of VLPs following recombinant expression
of viral proteins can be detected using conventional techniques known in the art, such
as by electron microscopy, X-ray crystallography, and the like. See, e.g., Baker et al.,
Biophys. J. (1991) 60:1445-1456; Hagensee et al., J. Virol. (1994) 68:4503-4505. For
example, VLPs can be isolated by density gradient centrifugation and/or identified by
characteristic density banding. Alternatively, cryoelectron microscopy can be
performed on vitrified aqueous samples of the VLP preparation in question, and
images recorded under appropriate exposure conditions.

By "particle-forming polypeptide" derived from a particular viral protein is
meant a full-length or near full-length viral protein, as well as a fragment thereof, or a
viral protein with internal deletions, which has the ability to form VLPs under 
conditions that favor VLP formation. Accordingly, the polypeptide may comprise the 
full-length sequence, fragments, truncated and partial sequences, as well as analogs 
and precursor forms of the reference molecule. The term therefore intends deletions, 
additions and substitutions to the sequence, so long as the polypeptide retains the 
ability to form a VLP. Thus, the term includes natural variations of the specified 
polypeptide since variations in coat proteins often occur between viral isolates. The 
term also includes deletions, additions and substitutions that do not naturally occur in 
the reference protein, so long as the protein retains the ability to form a VLP. 

Preferred substitutions are those which are conservative in nature, i.e., those
substitutions that take place within a family of amino acids that are related in their
side chains. Specifically, amino acids are generally divided into four families: (1)
acidic - aspartate and glutamate; (2) basic - lysine, arginine, histidine; (3) non-polar
- alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan;
and (4) uncharged polar - glycine, asparagine, glutamine, cysteine, serine, threonine,
tyrosine. Phenylalanine, tryptophan, and tyrosine are sometimes classified as
aromatic amino acids.
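
The four-family grouping above lends itself to a simple lookup. The Python sketch below encodes the families with one-letter amino acid codes and tests whether a substitution stays within a family. The names are hypothetical, the sulfur-containing residue in the uncharged polar family is entered as cysteine (C), and the sketch reflects only the side-chain grouping given in the text, not any scoring matrix or functional test.

```python
# Families as described in the text, using one-letter amino acid codes.
AMINO_ACID_FAMILIES = {
    "acidic": set("DE"),                 # aspartate, glutamate
    "basic": set("KRH"),                 # lysine, arginine, histidine
    "non-polar": set("AVLIPFMW"),        # alanine ... tryptophan
    "uncharged polar": set("GNQCSTY"),   # glycine ... tyrosine (C = cysteine)
}

# Sometimes classified separately as aromatic, per the text.
AROMATIC = set("FWY")


def family_of(residue):
    """Return the family name for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in AMINO_ACID_FAMILIES.items():
        if code in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def is_conservative(original, replacement):
    """True if both residues belong to the same side-chain family."""
    return family_of(original) == family_of(replacement)
```

For example, replacing aspartate with glutamate stays within the acidic family, whereas replacing lysine with aspartate crosses from basic to acidic.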

An "antigen" refers to a molecule containing one or more epitopes (either
linear, conformational or both) that will stimulate a host's immune system to make a
humoral and/or cellular antigen-specific response. The term is used interchangeably
with the term "immunogen." Normally, a B-cell epitope will include at least about 5
amino acids but can be as small as 3-4 amino acids. A T-cell epitope, such as a CTL
epitope, will include at least about 7-9 amino acids, and a helper T-cell epitope at least
about 12-20 amino acids. Normally, an epitope will include between about 7 and 15
amino acids, such as 9, 10, 12 or 15 amino acids. The term "antigen" denotes both
subunit antigens (i.e., antigens which are separate and discrete from a whole
organism with which the antigen is associated in nature), as well as killed, attenuated
or inactivated bacteria, viruses, fungi, parasites or other microbes. Antibodies such as
anti-idiotype antibodies, or fragments thereof, and synthetic peptide mimotopes,
which can mimic an antigen or antigenic determinant, are also captured under the
definition of antigen as used herein. Similarly, an oligonucleotide or polynucleotide
which expresses an antigen or antigenic determinant in vivo, such as in gene therapy
and DNA immunization applications, is also included in the definition of antigen
herein.
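
The epitope lengths mentioned above (for example 9, 10, 12 or 15 amino acids) can be pictured as fixed-length windows sliding along a protein sequence. The Python sketch below simply enumerates such windows with their 1-based start positions. It does not predict which windows are actual epitopes; the function name is hypothetical and the peptide used in the usage note is a placeholder string, not a sequence taken from the specification.

```python
def peptide_windows(protein, length):
    """List (start_position, peptide) for every window of the given length.

    Positions are 1-based. Returns an empty list if the protein is shorter
    than the requested window length.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return [
        (i + 1, protein[i:i + length])
        for i in range(len(protein) - length + 1)
    ]
```

For a ten-residue placeholder such as "MGARASVLSG", a window length of 9 yields two candidate peptides, starting at positions 1 and 2.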

For purposes of the present invention, antigens can be derived from any of
several known viruses, bacteria, parasites and fungi, as described more fully below.
The term also intends any of the various tumor antigens. Furthermore, for purposes of
the present invention, an "antigen" refers to a protein which includes modifications,
such as deletions, additions and substitutions (generally conservative in nature), to the
native sequence, so long as the protein maintains the ability to elicit an
immunological response, as defined herein. These modifications may be deliberate,
as through site-directed mutagenesis, or may be accidental, such as through mutations
of hosts which produce the antigens.

An "immunological response" to an antigen or composition is the development
in a subject of a humoral and/or a cellular immune response to an antigen present in
the composition of interest. For purposes of the present invention, a "humoral
immune response" refers to an immune response mediated by antibody molecules,
while a "cellular immune response" is one mediated by T-lymphocytes and/or other
white blood cells. One important aspect of cellular immunity involves an
antigen-specific response by cytolytic T-cells ("CTLs"). CTLs have specificity for peptide
antigens that are presented in association with proteins encoded by the major
histocompatibility complex (MHC) and expressed on the surfaces of cells. CTLs help
induce and promote the destruction of intracellular microbes, or the lysis of cells
infected with such microbes. Another aspect of cellular immunity involves an
antigen-specific response by helper T-cells. Helper T-cells act to help stimulate the
function, and focus the activity of, nonspecific effector cells against cells displaying
peptide antigens in association with MHC molecules on their surface. A "cellular
immune response" also refers to the production of cytokines, chemokines and other
such molecules produced by activated T-cells and/or other white blood cells,
including those derived from CD4+ and CD8+ T-cells.

A composition or vaccine that elicits a cellular immune response may serve to
sensitize a vertebrate subject by the presentation of antigen in association with MHC
molecules at the cell surface. The cell-mediated immune response is directed at, or
near, cells presenting antigen at their surface. In addition, antigen-specific
T-lymphocytes can be generated to allow for the future protection of an immunized host.

The ability of a particular antigen to stimulate a cell-mediated immunological
response may be determined by a number of assays, such as by lymphoproliferation
(lymphocyte activation) assays, CTL cytotoxic cell assays, or by assaying for
T-lymphocytes specific for the antigen in a sensitized subject. Such assays are well
known in the art. See, e.g., Erickson et al., J. Immunol. (1993) 151:4189-4199; Doe et
al., Eur. J. Immunol. (1994) 24:2369-2376. Recent methods of measuring cell-mediated
immune response include measurement of intracellular cytokines or
cytokine secretion by T-cell populations, or by measurement of epitope-specific
T-cells (e.g., by the tetramer technique) (reviewed by McMichael, A.J., and O'Callaghan,
C.A., J. Exp. Med. 187(9):1367-1371, 1998; McHeyzer-Williams, M.G., et al.,
Immunol. Rev. 150:5-21, 1996; Lalvani, A., et al., J. Exp. Med. 186:859-865, 1997).

Thus, an immunological response as used herein may be one which stimulates
the production of CTLs, and/or the production or activation of helper T-cells. The
antigen of interest may also elicit an antibody-mediated immune response. Hence, an
immunological response may include one or more of the following effects: the
production of antibodies by B-cells; and/or the activation of suppressor T-cells and/or
γδ T-cells directed specifically to an antigen or antigens present in the composition or
vaccine of interest. These responses may serve to neutralize infectivity, and/or
mediate antibody-complement, or antibody dependent cell cytotoxicity (ADCC) to
provide protection to an immunized host. Such responses can be determined using
standard immunoassays and neutralization assays, well known in the art.

An "immunogenic composition" is a composition that comprises an antigenic
molecule where administration of the composition to a subject results in the
development in the subject of a humoral and/or a cellular immune response to the
antigenic molecule of interest. The immunogenic composition can be introduced
directly into a recipient subject, such as by injection, inhalation, oral, intranasal and
mucosal (e.g., intra-rectally or intra-vaginally) administration.

By "subunit vaccine" is meant a vaccine composition which includes one or
more selected antigens but not all antigens, derived from or homologous to, an
antigen from a pathogen of interest such as from a virus, bacterium, parasite or
fungus. Such a composition is substantially free of intact pathogen cells or
pathogenic particles, or the lysate of such cells or particles. Thus, a "subunit vaccine"
can be prepared from at least partially purified (preferably substantially purified)
immunogenic polypeptides from the pathogen, or analogs thereof. The method of
obtaining an antigen included in the subunit vaccine can thus include standard
purification techniques, recombinant production, or synthetic production.

"Substantially purified" generally refers to isolation of a substance (compound,
polynucleotide, protein, polypeptide, polypeptide composition) such that the
substance comprises the majority percent of the sample in which it resides. Typically
in a sample a substantially purified component comprises 50%, preferably 80%-85%,
more preferably 90-95% of the sample. Techniques for purifying polynucleotides and
polypeptides of interest are well-known in the art and include, for example,
ion-exchange chromatography, affinity chromatography and sedimentation according to
density.

A "coding sequence" or a sequence which "encodes" a selected polypeptide, is 
a nucleic acid molecule which is transcribed (in the case of DNA) and translated (in 
the case of mRNA) into a polypeptide in vivo when placed under the control of 
appropriate regulatory sequences (or "control elements"). The boundaries of the
coding sequence are determined by a start codon at the 5' (amino) terminus and a 
translation stop codon at the 3' (carboxy) terminus. A coding sequence can include, 
but is not limited to, cDNA from viral, procaryotic or eucaryotic mRNA, genomic 
DNA sequences from viral or procaryotic DNA, and even synthetic DNA sequences. 
A transcription termination sequence may be located 3' to the coding sequence. 

Typical "control elements" include, but are not limited to, transcription
promoters, transcription enhancer elements, transcription termination signals,
polyadenylation sequences (located 3' to the translation stop codon), sequences for
optimization of initiation of translation (located 5' to the coding sequence), and
translation termination sequences. 

A "nucleic acid" molecule can include, but is not limited to, procaryotic 
sequences, eucaryotic mRNA, cDNA from eucaryotic mRNA, genomic DNA 
sequences from eucaryotic (e.g., mammalian) DNA, and even synthetic DNA 
sequences. The term also captures sequences that include any of the known base
analogs of DNA and RNA.

"Operably linked" refers to an arrangement of elements wherein the
components so described are configured so as to perform their usual function. Thus, a 
given promoter operably linked to a coding sequence is capable of effecting the 
expression of the coding sequence when the proper enzymes are present. The 
promoter need not be contiguous with the coding sequence, so long as it functions to 
direct the expression thereof. Thus, for example, intervening untranslated yet
transcribed sequences can be present between the promoter sequence and the coding 
sequence and the promoter sequence can still be considered "operably linked" to the 
coding sequence.

"Recombinant" as used herein to describe a nucleic acid molecule means a
polynucleotide of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue
of its origin or manipulation: (1) is not associated with all or a portion of the
polynucleotide with which it is associated in nature; and/or (2) is linked to a
polynucleotide other than that to which it is linked in nature. The term "recombinant"
as used with respect to a protein or polypeptide means a polypeptide produced by 
expression of a recombinant polynucleotide. "Recombinant host cells," "host cells," 
"cells," "cell lines," "cell cultures," and other such terms denoting procaryotic 
microorganisms or eucaryotic cell lines cultured as unicellular entities, are used
interchangeably, and refer to cells which can be, or have been, used as recipients for
recombinant vectors or other transfer DNA, and include the progeny of the original
cell which has been transfected. It is understood that the progeny of a single parental 
cell may not necessarily be completely identical in morphology or in genomic or total 
DNA complement to the original parent, due to accidental or deliberate mutation.
Progeny of the parental cell which are sufficiently similar to the parent to be
characterized by the relevant property, such as the presence of a nucleotide sequence
encoding a desired peptide, are included in the progeny intended by this definition,
and are covered by the above terms.

Techniques for determining amino acid sequence "similarity" are well known
in the art. In general, "similarity" means the exact amino acid to amino acid
comparison of two or more polypeptides at the appropriate place, where amino acids
are identical or possess similar chemical and/or physical properties such as charge or
hydrophobicity. A so-termed "percent similarity" then can be determined between the
compared polypeptide sequences. Techniques for determining nucleic acid and amino
acid sequence identity also are well known in the art and include determining the
nucleotide sequence of the mRNA for that gene (usually via a cDNA intermediate)
and determining the amino acid sequence encoded thereby, and comparing this to a
second amino acid sequence. In general, "identity" refers to an exact nucleotide to
nucleotide or amino acid to amino acid correspondence of two polynucleotides or
polypeptide sequences, respectively.

Two or more polynucleotide sequences can be compared by determining their
"percent identity." Two or more amino acid sequences likewise can be compared by
determining their "percent identity." The percent identity of two sequences, whether
nucleic acid or peptide sequences, is generally described as the number of exact
matches between two aligned sequences divided by the length of the shorter sequence
and multiplied by 100. An approximate alignment for nucleic acid sequences is
provided by the local homology algorithm of Smith and Waterman, Advances in
Applied Mathematics 2:482-489 (1981). This algorithm can be extended for use with
peptide sequences using the scoring matrix developed by Dayhoff, Atlas of Protein
Sequences and Structure, M.O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical
Research Foundation, Washington, D.C., USA, and normalized by Gribskov, Nucl.
Acids Res. 14(6):6745-6763 (1986). An implementation of this algorithm for nucleic
acid and peptide sequences is provided by the Genetics Computer Group (Madison,
WI) in their BestFit utility application. The default parameters for this method are
described in the Wisconsin Sequence Analysis Package Program Manual, Version 8
(1995) (available from Genetics Computer Group, Madison, WI). Other equally
suitable programs for calculating the percent identity or similarity between sequences
are generally known in the art.
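
As an illustration of the calculation just described, the following Python sketch
counts exact matches between two already-aligned sequences and divides by the
length of the shorter sequence. The function name percent_identity and the use of
"-" as the gap character are assumptions made for this example; they are not taken
from BestFit or from any other program named above.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal aligned length.

    Exact matches are divided by the length of the shorter (ungapped)
    sequence and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    a = aligned_a.upper()
    b = aligned_b.upper()

    # Count positions where both sequences carry the same residue (gaps never match).
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)

    # Ungapped lengths of each sequence; the shorter one is the denominator.
    len_a = sum(1 for x in a if x != gap)
    len_b = sum(1 for y in b if y != gap)
    shorter = min(len_a, len_b)
    if shorter == 0:
        raise ValueError("at least one sequence contains no residues")

    return 100.0 * matches / shorter
```

For instance, percent_identity("ATGC-TGA", "ATGCATG-") finds six matching
positions over a shorter-sequence length of seven. How the alignment itself is
produced is a separate step that this sketch does not address.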

For example, percent identity of a particular nucleotide sequence to a reference 
sequence can be determined using the homology algorithm of Smith and Waterman 
with a default scoring table and a gap penalty of six nucleotide positions. Another
method of establishing percent identity in the context of the present invention is to use 
the MPSRCH package of programs copyrighted by the University of Edinburgh, 
developed by John F. Collins and Shane S. Sturrok, and distributed by IntelliGenetics, 
Inc. (Mountain View, CA). From this suite of packages, the Smith-Waterman 
algorithm can be employed where default parameters are used for the scoring table 
(for example, gap open penalty of 12, gap extension penalty of one, and a gap of six).
From the data generated, the "Match" value reflects "sequence identity." Other 
suitable programs for calculating the percent identity or similarity between sequences 
are generally known in the art, such as the alignment program BLAST, which can also 
be used with default parameters. For example, BLASTN and BLASTP can be used 
with the following default parameters: genetic code = standard; filter = none; strand = 
both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50 sequences;
sort by = HIGH SCORE; Databases = non-redundant, GenBank + EMBL + DDBJ +
PDB + GenBank CDS translations + Swiss protein + Spupdate + PIR. Details of these
programs can be found at the following internet address: 
http://www.ncbi.nlm.gov/cgi-bin/BLAST. 
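
The local homology algorithm of Smith and Waterman referred to above can be
sketched as follows. This is a simplified illustration only: it assumes a linear
gap penalty and match/mismatch scores chosen for the example, not the gap
open/extension parameters or scoring tables of MPSRCH, BestFit or BLAST, and it
returns only the best local alignment score rather than the alignment itself.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 1,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Best local alignment score of a and b (linear gap penalty, illustrative values)."""
    a = a.upper()
    b = b.upper()
    best = 0
    # Only the previous row of the dynamic-programming matrix is needed for the score.
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap introduced in b
            left = curr[j - 1] + gap  # gap introduced in a
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best
```

A production tool would typically add affine gap penalties and a traceback step
to recover the aligned region from which a "Match" value is reported.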

One of skill in the art can readily determine the proper search parameters to
use for a given sequence in the above programs. For example, the search parameters
may vary based on the size of the sequence in question. Thus, for example, a
representative embodiment of the present invention would include an isolated
polynucleotide having X contiguous nucleotides, wherein (i) the X contiguous
nucleotides have at least about 50% identity to Y contiguous nucleotides derived from
any of the sequences described herein, (ii) X equals Y, and (iii) X is greater than or
equal to 6 nucleotides and up to 5000 nucleotides, preferably greater than or equal to
8 nucleotides and up to 5000 nucleotides, more preferably 10-12 nucleotides and up to
5000 nucleotides, and even more preferably 15-20 nucleotides, up to the number of
nucleotides present in the full-length sequences described herein (e.g., see the
Sequence Listing and claims), including all integer values falling within the above-
described ranges.
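
One way to picture the X/Y criterion above is as a sliding-window comparison.
The following Python sketch is an assumption-laden illustration, not a method
stated in this specification: it compares ungapped windows only, treats "at
least about 50%" as a fraction of 0.5, and uses the hypothetical function name
has_matching_window.

```python
def has_matching_window(candidate: str, reference: str, x: int,
                        min_identity: float = 0.5) -> bool:
    """True if some x-nucleotide window of candidate reaches min_identity
    against some x-nucleotide window of reference (X equals Y, no gaps)."""
    if x < 6:
        raise ValueError("x must be at least 6 nucleotides")
    candidate = candidate.upper()
    reference = reference.upper()

    for i in range(len(candidate) - x + 1):
        window = candidate[i:i + x]
        for j in range(len(reference) - x + 1):
            ref_window = reference[j:j + x]
            # Position-by-position comparison of the two equal-length windows.
            matches = sum(1 for a, b in zip(window, ref_window) if a == b)
            if matches / x >= min_identity:
                return True
    return False
```

The exhaustive scan is quadratic in sequence length, which is acceptable for a
demonstration but not for database-scale searches.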

The synthetic expression cassettes (and purified polynucleotides) of the
present invention include related polynucleotide sequences having about 80% to
100%, greater than 80-85%, preferably greater than 90-92%, more preferably greater
than 95%, and most preferably greater than 98% sequence (including all integer
values falling within these described ranges) identity to the synthetic expression
cassette sequences disclosed herein (for example, to the claimed sequences or other
sequences of the present invention) when the sequences of the present invention are
used as the query sequence.

Two nucleic acid fragments are considered to "selectively hybridize" as
described herein. The degree of sequence identity between two nucleic acid
molecules affects the efficiency and strength of hybridization events between such
molecules. A partially identical nucleic acid sequence will at least partially inhibit a
completely identical sequence from hybridizing to a target molecule. Inhibition of
hybridization of the completely identical sequence can be assessed using
hybridization assays that are well known in the art (e.g., Southern blot, Northern blot,
solution hybridization, or the like, see Sambrook, et al., supra or Ausubel et al.,
supra). Such assays can be conducted using varying degrees of selectivity, for
example, using conditions varying from low to high stringency. If conditions of low 
stringency are employed, the absence of non-specific binding can be assessed using a 
secondary probe that lacks even a partial degree of sequence identity (for example, a 
probe having less than about 30% sequence identity with the target molecule), such 
that, in the absence of non-specific binding events, the secondary probe will not 
hybridize to the target. 

When utilizing a hybridization-based detection system, a nucleic acid probe is 
chosen that is complementary to a target nucleic acid sequence, and then by selection 
of appropriate conditions the probe and the target sequence "selectively hybridize," or 
bind, to each other to form a hybrid molecule. A nucleic acid molecule that is capable 
of hybridizing selectively to a target sequence under "moderately stringent" conditions
typically hybridizes under conditions that allow detection of a target nucleic acid
sequence of at least about 10-14 nucleotides in length having at least approximately
70% sequence identity with the sequence of the selected nucleic acid probe. Stringent
hybridization conditions typically allow detection of target nucleic acid sequences of
at least about 10-14 nucleotides in length having a sequence identity of greater than
about 90-95% with the sequence of the selected nucleic acid probe. Hybridization
conditions useful for probe/target hybridization where the probe and target have a
specific degree of sequence identity, can be determined as is known in the art (see, for
example, Nucleic Acid Hybridization: A Practical Approach, editors B.D. Hames and
S.J. Higgins,
(1985) Oxford; Washington, DC; IRL Press). 

With respect to stringency conditions for hybridization, it is well known in the
art that numerous equivalent conditions can be employed to establish a particular
stringency by varying, for example, the following factors: the length and nature of
probe and target sequences, base composition of the various sequences,
concentrations of salts and other hybridization solution components, the presence or
absence of blocking agents in the hybridization solutions (e.g., formamide, dextran
sulfate, and polyethylene glycol), hybridization reaction temperature and time
parameters, as well as varying wash conditions. A particular set of
hybridization conditions is selected following standard methods in the art (see, for
example, Sambrook, et al., supra or Ausubel et al., supra).

A first polynucleotide is "derived from" a second polynucleotide if it has the
same or substantially the same basepair sequence as a region of the second
polynucleotide, its cDNA, complements thereof, or if it displays sequence identity as
described above.

A first polypeptide is "derived from" a second polypeptide if it is (i) encoded
by a first polynucleotide derived from a second polynucleotide, or (ii) displays
sequence identity to the second polypeptide as described above.

Generally, a viral polypeptide is "derived from" a particular polypeptide of a 
virus (viral polypeptide) if it is (i) encoded by an open reading frame of a
polynucleotide of that virus (viral polynucleotide), or (ii) displays sequence identity to
polypeptides of that virus as described above.

"Encoded by" refers to a nucleic acid sequence which codes for a polypeptide 
sequence, wherein the polypeptide sequence or a portion thereof contains an amino 
acid sequence of at least 3 to 5 amino acids, more preferably at least 8 to 10 amino
acids, and even more preferably at least 15 to 20 amino acids from a polypeptide
encoded by the nucleic acid sequence. Also encompassed are polypeptide sequences
which are immunologically identifiable with a polypeptide encoded by the sequence. 

"Purified polynucleotide" refers to a polynucleotide of interest or fragment
thereof which is essentially free, e.g., contains less than about 50%, preferably less
than about 70%, and more preferably less than about 90%, of the protein with which
the polynucleotide is naturally associated. Techniques for purifying polynucleotides
of interest are well-known in the art and include, for example, disruption of the cell
containing the polynucleotide with a chaotropic agent and separation of the
polynucleotide(s) and proteins by ion-exchange chromatography, affinity
chromatography and sedimentation according to density.

By "nucleic acid immunization" is meant the introduction of a nucleic acid 
molecule encoding one or more selected antigens into a host cell, for the in vivo 
expression of an antigen, antigens, an epitope, or epitopes. The nucleic acid molecule 
can be introduced directly into a recipient subject, such as by injection, inhalation,
oral, intranasal and mucosal administration, or the like, or can be introduced ex vivo,
into cells which have been removed from the host. In the latter case, the transformed
cells are reintroduced into the subject where an immune response can be mounted
against the antigen encoded by the nucleic acid molecule. 

"Gene transfer" or "gene delivery" refers to methods or systems for reliably 
inserting DNA of interest into a host cell. Such methods can result in transient 
expression of non-integrated transferred DNA, extrachromosomal replication and
expression of transferred replicons (e.g., episomes), or integration of transferred 
genetic material into the genomic DNA of host cells. Gene delivery expression 
vectors include, but are not limited to, vectors derived from alphaviruses, pox viruses 
and vaccinia viruses. When used for immunization, such gene delivery expression
vectors may be referred to as vaccines or vaccine vectors.

"T lymphocytes" or "T cells" are non-antibody producing lymphocytes that
constitute a part of the cell-mediated arm of the immune system. T cells arise from
immature lymphocytes that migrate from the bone marrow to the thymus, where they
undergo a maturation process under the direction of thymic hormones. Here, the
mature lymphocytes rapidly divide increasing to very large numbers. The maturing T
cells become immunocompetent based on their ability to recognize and bind a specific 
antigen. Activation of immunocompetent T cells is triggered when an antigen binds 
to the lymphocyte's surface receptors. 

The term "transfection" is used to refer to the uptake of foreign DNA by a cell.
A cell has been "transfected" when exogenous DNA has been introduced inside the
cell membrane. A number of transfection techniques are generally known in the art.
See, e.g., Graham et al. (1973) Virology, 52:456, Sambrook et al. (1989) Molecular
Cloning, a laboratory manual, Cold Spring Harbor Laboratories, New York, Davis et
al. (1986) Basic Methods in Molecular Biology, Elsevier, and Chu et al. (1981) Gene
13:197. Such techniques can be used to introduce one or more exogenous DNA
moieties into suitable host cells. The term refers to both stable and transient uptake of
the genetic material, and includes uptake of peptide- or antibody-linked DNAs. 

A "vector" is capable of transferring gene sequences to target cells (e.g., viral 
vectors, non-viral vectors, particulate carriers, and liposomes). Typically, "vector
construct," "expression vector," and "gene transfer vector," mean any nucleic acid
construct capable of directing the expression of a gene of interest and which can
transfer gene sequences to target cells. Thus, the term includes cloning and
expression vehicles, as well as viral vectors. 

Transfer of a "suicide gene" (e.g., a drug-susceptibility gene) to a target cell 
renders the cell sensitive to compounds or compositions that are relatively nontoxic to 
normal cells. Moolten, F.L. (1994) Cancer Gene Ther. 1:279-287. Examples of
suicide genes are thymidine kinase of herpes simplex virus (HSV-tk), cytochrome
P450 (Manome et al. (1996) Gene Therapy 3:513-520), human deoxycytidine kinase
(Manome et al. (1996) Nature Medicine 2(5):567-573) and the bacterial enzyme
cytosine deaminase (Dong et al. (1996) Human Gene Therapy 7:713-720). Cells
which express these genes are rendered sensitive to the effects of the relatively
nontoxic prodrugs ganciclovir (HSV-tk), cyclophosphamide (cytochrome P450 2B1),
cytosine arabinoside (human deoxycytidine kinase) or 5-fluorocytosine (bacterial
cytosine deaminase). Culver et al. (1992) Science 256:1550-1552, Huber et al. (1994)
Proc. Natl. Acad. Sci. USA 91:8302-8306.

A "selectable marker" or "reporter marker" refers to a nucleotide sequence
included in a gene transfer vector that has no therapeutic activity, but rather is
included to allow for simpler preparation, manufacturing, characterization or testing 
of the gene transfer vector. 

A "specific binding agent" refers to a member of a specific binding pair of
molecules wherein one of the molecules specifically binds to the second molecule
through chemical and/or physical means. One example of a specific binding agent is
an antibody directed against a selected antigen.

By "subject" is meant any member of the subphylum chordata, including, 
without limitation, humans and other primates, including non-human primates such as
chimpanzees and other apes and monkey species; farm animals such as cattle, sheep,
pigs, goats and horses; domestic mammals such as dogs and cats; laboratory animals 
including rodents such as mice, rats and guinea pigs; birds, including domestic, wild 
and game birds such as chickens, turkeys and other gallinaceous birds, ducks, geese, 
and the like. The term does not denote a particular age. Thus, both adult and
newborn individuals are intended to be covered. The system described above is
intended for use in any of the above vertebrate species, since the immune systems of
all of these vertebrates operate similarly.

By "pharmaceutically acceptable" or "pharmacologically acceptable" is meant
a material which is not biologically or otherwise undesirable, i.e., the material may be
administered to an individual in a formulation or composition without causing any
undesirable biological effects or interacting in a deleterious manner with any of the
components of the composition in which it is contained.

By "physiological pH" or a "pH in the physiological range" is meant a pH in 
the range of approximately 7.2 to 8.0 inclusive, more typically in the range of 
approximately 7.2 to 7.6 inclusive. 

As used herein, "treatment" refers to any of (i) the prevention of infection or
reinfection, as in a traditional vaccine, (ii) the reduction or elimination of symptoms, 
and (iii) the substantial or complete elimination of the pathogen in question. 
Treatment may be effected prophylactically (prior to infection) or therapeutically 
(following infection).

"Lentiviral vector" and "recombinant lentiviral vector" refer to a nucleic acid
construct which carries, and within certain embodiments, is capable of directing the
expression of a nucleic acid molecule of interest. The lentiviral vector includes at least
one transcriptional promoter/enhancer or locus defining element(s), or other elements
which control gene expression by other means such as alternate splicing, nuclear RNA
export, post-transcriptional modification of messenger, or post-translational
modification of protein. Such vector constructs must also include a packaging signal,
long terminal repeats (LTRs) or portion thereof, and positive and negative strand
primer binding sites appropriate to the retrovirus used (if these are not already present
in the retroviral vector). Optionally, the recombinant lentiviral vector may also
include a signal which directs polyadenylation, selectable markers such as Neo, TK,
hygromycin, phleomycin, histidinol, or DHFR, as well as one or more restriction sites
and a translation termination sequence. By way of example, such vectors typically
include a 5' LTR, a tRNA binding site, a packaging signal, an origin of second strand
DNA synthesis, and a 3' LTR or a portion thereof.

"Lentiviral vector particle" as utilized within the present invention refers to a 
lentivirus which carries at least one gene of interest. The retrovirus may also contain
a selectable marker. The recombinant lentivirus is capable of reverse transcribing its
genetic material (RNA) into DNA and incorporating this genetic material into a host
cell's DNA upon infection. Lentiviral vector particles may have a lentiviral envelope,
a non-lentiviral envelope (e.g., an ampho or VSV-G envelope), or a chimeric
envelope. 

"Nucleic acid expression vector" or "Expression cassette" refers to an 
assembly which is capable of directing the expression of a sequence or gene of
interest. The nucleic acid expression vector includes a promoter which is operably
linked to the sequences or gene(s) of interest. Other control elements may be present
as well. Expression cassettes described herein may be contained within a plasmid
construct. In addition to the components of the expression cassette, the plasmid
construct may also include a bacterial origin of replication, one or more selectable
markers, a signal which allows the plasmid construct to exist as single-stranded DNA
(e.g., an M13 origin of replication), a multiple cloning site, and a "mammalian" origin
of replication (e.g., an SV40 or adenovirus origin of replication).

"Packaging cell" refers to a cell which contains those elements necessary for
production of infectious recombinant retrovirus which are lacking in a recombinant
retroviral vector. Typically, such packaging cells contain one or more expression
cassettes which are capable of expressing Gag, pol and env
proteins.

"Producer cell" or "vector producing cell" refers to a cell which contains all 
elements necessary for production of recombinant retroviral vector particles.

2. Modes of Carrying Out the Invention 

Before describing the present invention in detail, it is to be understood that 
this invention is not limited to particular formulations or process parameters as such
may, of course, vary. It is also to be understood that the terminology used herein is
for the purpose of describing particular embodiments of the invention only, and is not
intended to be limiting. 

Although a number of methods and materials similar or equivalent to those
described herein can be used in the practice of the present invention, the preferred
materials and methods are described herein.

2.1 Synthetic Expression Cassettes

2.1.1 Modification of HIV-1 Type C Gag and Env Nucleic Acid
Coding Sequences 

One aspect of the present invention is the generation of HIV-1 type C Gag and
Env protein coding sequences, and related sequences, having improved expression 
relative to the corresponding wild-type sequences. 

2.1.1.1. Modification of Gag Nucleic Acid Coding Sequences 
An exemplary embodiment of the present invention is illustrated herein by 
modifying the Gag protein wild-type sequences obtained from the AF110965 and
AF110967 strains of HIV-1, subtype C (see, for example, Korber et al.
(1998) Human Retroviruses and Aids. Los Alamos, New Mexico: Los Alamos
National Laboratory; Novitsky et al. (1999) J. Virol. 73(5):4427-4432, for molecular
cloning of various subtype C clones from Botswana). Gag sequence obtained from
other Type C HIV-1 variants may be manipulated in similar fashion following the
teachings of the present specification. Such other variants include, but are not limited
to, Gag protein encoding sequences obtained from the isolates of HIV-1 Type C, for
example as described in Novitsky et al., (1999), supra; Myers et al., infra; Virology,
3rd Edition (W.K. Joklik ed. 1988); Fundamental Virology, 2nd Edition (B.N. Fields
and D.M. Knipe, eds. 1991); Virology, 3rd Edition (Fields, BN, DM Knipe, PM
Howley, Editors, 1996, Lippincott-Raven, Philadelphia, PA) and on the World Wide
Web (Internet), for example at
http://hiv-web.lanl.gov/cgi-bin/hivDB3/public/wdb/ssampublic and
http://hiv-web.lanl.gov.

First, the HIV-1 codon usage pattern was modified so that the resulting nucleic
acid coding sequence was comparable to codon usage found in highly expressed
human genes (Example 1). The HIV codon usage reflects a high content of the
nucleotides A or T of the codon-triplet. The effect of the HIV-1 codon usage is a high
AT content in the DNA sequence that results in a decreased translation ability and
instability of the mRNA. In comparison, highly expressed human codons prefer the
nucleotides G or C. The Gag coding sequences were modified to be comparable to
codon usage found in highly expressed human genes.

Second, there are inhibitory (or instability) elements (INS) located within the
coding sequences of the Gag coding sequences. The RRE is a secondary RNA
structure that interacts with the HIV-encoded Rev protein to overcome the expression
down-regulating effects of the INS. To overcome the post-transcriptional activating
mechanisms of RRE and Rev, the instability elements can be inactivated by
introducing multiple point mutations that do not alter the reading frame of the
encoded proteins. Subtype C Gag-encoding sequences having inactivated RRE sites
are shown in Figures 1 (SEQ ID NO:3), 2 (SEQ ID NO:4), 5 (SEQ ID NO:20) and 6
(SEQ ID NO:26).
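
The codon usage modification described above can be pictured with a short Python
sketch that replaces each codon with a G/C-rich synonymous codon and confirms
that the encoded amino acid sequence is unchanged. The PREFERRED table below is
an illustrative assumption (one G/C-ending codon per amino acid); it is not the
codon table of Example 1, and the 15-nucleotide input used in the usage note is
a made-up sequence, not an HIV sequence.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa
                for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical preferred-codon table: one G/C-rich codon per amino acid.
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}

def codons(seq: str) -> list:
    """Split an in-frame coding sequence into codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def translate(seq: str) -> str:
    """Translate a coding sequence; stop codons are shown as '*'."""
    return "".join(GENETIC_CODE[c] for c in codons(seq.upper()))

def recode(seq: str) -> str:
    """Swap each sense codon for the preferred synonym; stop codons are kept."""
    out = []
    for codon in codons(seq.upper()):
        aa = GENETIC_CODE[codon]
        out.append(codon if aa == "*" else PREFERRED[aa])
    return "".join(out)

def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)
```

Applied to the made-up input "ATGAAAAGATTATAA", recode raises the G/C fraction
while translate gives the same protein before and after. Inactivation of INS
elements by point mutation is a separate step not covered by this sketch.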

Modification of the Gag polypeptide coding sequences results in improved
expression relative to the wild-type coding sequences in a number of mammalian cell
lines (as well as other types of cell lines, including, but not limited to, insect cells). 
Further, expression of the sequences results in production of virus-like particles 
(VLPs) by these cell lines (see below). 


2.1.1.2 Modification of Env Nucleic Acid Coding Sequences

Similarly, the present invention also includes modified Env proteins. Wild-
type Env sequences are obtained from the AF110968 and AF110975 strains of HIV-1,
type C (see, for example, Novitsky et al. (1999) J. Virol. 73(5):4427-4432, for
molecular cloning of various subtype C clones from Botswana). Env sequence
obtained from other Type C HIV-1 variants may be manipulated in similar fashion
following the teachings of the present specification. Such other variants include, but
are not limited to, Env protein encoding sequences obtained from the isolates of
HIV-1 Type C, described above.

The codon usage pattern for Env was modified as described above for Gag so
that the resulting nucleic acid coding sequence was comparable to codon usage found
in highly expressed human genes. Experiments can be performed in support of the
present invention to show that the synthetic Env sequences are capable of a higher
level of protein production relative to the native Env sequences.

Modification of the Env polypeptide coding sequences results in improved
expression relative to the wild-type coding sequences in a number of mammalian cell
lines (as well as other types of cell lines, including, but not limited to, insect cells).
Similar Env polypeptide coding sequences can be obtained, optimized and tested for
improved expression from a variety of isolates, including those described above for
Gag.

2.1.2 Further Modification of Sequences Including HIV-1 Gag
Nucleic Acid Coding Sequences 

Experiments can be performed to show that similar modifications of HIV-1
Gag-protease and Gag-polymerase sequences also result in improved expression of
the polyproteins, as well as the production of VLPs formed by polypeptides produced
from such modified coding sequences. 

For the Gag-protease sequence, the changes in codon usage are typically
restricted to the regions up to the -1 frameshift and starting again at the end of the 
Gag reading frame; however, regions within the frameshift translation region can be 
modified as well. Further, inhibitory (or instability) elements (INS) located within the 
coding sequences of the Gag-protease polypeptide coding sequence can be altered as 
well. 

For the Gag-polymerase sequence, the changes in codon usage can be similar 
to those for the Gag-protease sequence.

In addition to polyproteins containing HIV-related sequences, the Gag
encoding sequences of the present invention can be fused to other polypeptides 
(creating chimeric polypeptides) for which an immunogenic response is desired. 

Further sequences useful in the practice of the present invention include, but
are not limited to, sequences encoding further viral epitopes/antigens (including but
not limited to, HCV antigens (e.g., E1, E2; Houghton, M., et al., U.S. Patent No.
5,714,596, issued February 3, 1998; Houghton, M., et al., U.S. Patent No. 5,712,088,
issued January 27, 1998; Houghton, M., et al., U.S. Patent No. 5,683,864, issued
November 4, 1997; Weiner, A.J., et al., U.S. Patent No. 5,728,520, issued March 17,
1998; Weiner, A.J., et al., U.S. Patent No. 5,766,845, issued June 16, 1998; Weiner,
A.J., et al., U.S. Patent No. 5,670,152, issued September 23, 1997), HIV antigens
(e.g., derived from tat, rev, nef and/or env)); and sequences encoding tumor
antigens/epitopes. Additional sequences are described below. Also, variations on the
orientation of the Gag and other coding sequences, relative to each other, are
described below. 

Gag, Gag-protease, and/or Gag-polymerase polypeptide coding sequences can
be obtained from Type C HIV isolates, see, e.g., Myers et al., Los Alamos Database,
Los Alamos National Laboratory, Los Alamos, New Mexico (1992); Myers et al.,
Human Retroviruses and AIDS, 1997, Los Alamos, New Mexico: Los Alamos
National Laboratory. Synthetic expression cassettes can be generated using such
coding sequences as starting material by following the teachings of the present
specification (e.g., see Example 1).

Further, the synthetic expression cassettes of the present invention include
related Gag polypeptide coding sequences having greater than 85%, preferably greater
than 90%, more preferably greater than 95%, and most preferably greater than 98%
sequence identity to the synthetic expression cassette sequences disclosed herein (for
example, Figures 1, 2, 5 and 6 (SEQ ID NOs:3, 4, 20 and 21)).
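
As an illustration of how the identity thresholds above might be applied, the following minimal sketch computes percent identity over two sequences that have already been aligned (gaps written as "-"). The function names, the tier labels and the choice of denominator (all aligned columns that are not gap-versus-gap) are assumptions made for illustration; alignment programs differ in how they define percent identity.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns, ignoring columns where both are gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Thresholds from the text; a sequence must be strictly greater than the cutoff.
TIERS = [
    (98.0, "most preferred"),
    (95.0, "more preferred"),
    (90.0, "preferred"),
    (85.0, "included"),
]


def identity_tier(pct):
    """Return the label of the highest threshold that pct exceeds."""
    for cutoff, label in TIERS:
        if pct > cutoff:
            return label
    return "below threshold"
```

For example, `identity_tier(percent_identity(candidate, reference))` labels an aligned candidate sequence against a disclosed cassette sequence; both arguments are assumed to come from an alignment step that is not shown here.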

The present invention also includes related Env polypeptide coding sequences
having greater than 85%, preferably greater than 90%, more preferably greater than
95%, and most preferably greater than 98% sequence identity to the sequences
disclosed herein (for example, Figures 3 and 4, SEQ ID NOs:5-17). Various coding
regions are indicated in Figures 3 and 4; for example, in Figure 3 (AF110968),
nucleotides 1-81 (SEQ ID NO:18) encode a signal peptide, nucleotides 82-1512 (SEQ
ID NO:6) encode a gp120 polypeptide, nucleotides 1513 to 2547 (SEQ ID NO:10)
encode a gp41 polypeptide, nucleotides 82-2025 (SEQ ID NO:7) encode a gp140
polypeptide and nucleotides 82-2547 (SEQ ID NO:8) encode a gp160 polypeptide.
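
The nucleotide coordinates listed above for Figure 3 can be captured in a small lookup table. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the sequence passed in is assumed to be the full coding sequence of Figure 3 (not reproduced here). Coordinates are 1-based and inclusive, as in the text, and are converted to Python slices.

```python
# 1-based, inclusive nucleotide coordinates as stated for Figure 3 (AF110968).
ENV_REGIONS = {
    "signal peptide": (1, 81),
    "gp120": (82, 1512),
    "gp41": (1513, 2547),
    "gp140": (82, 2025),
    "gp160": (82, 2547),
}


def region_length(name):
    """Length in nucleotides of a named region."""
    start, end = ENV_REGIONS[name]
    return end - start + 1


def extract_region(sequence, name):
    """Return the subsequence for a named region of the full Env coding sequence."""
    start, end = ENV_REGIONS[name]
    if end > len(sequence):
        raise ValueError(f"sequence too short for region {name!r}")
    return sequence[start - 1:end]  # convert 1-based inclusive to a Python slice
```

Each listed region spans a whole number of codons, and the gp120 and gp41 regions together cover the gp160 region, which is a quick consistency check on the coordinates.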

2.1.3 Expression of Synthetic Sequences Encoding HIV-1 Gag or
Env and Related Polypeptides
Synthetic Gag- and Env-encoding sequences (expression cassettes) of the 
present invention can be cloned into a number of different expression vectors to 
evaluate levels of expression and, in the case of Gag, production of VLPs. The 
synthetic DNA fragments for Env and Gag can be cloned into eucaryotic expression
vectors, including a transient expression vector, CMV-promoter-based mammalian
vectors, and a shuttle vector for use in baculovirus expression systems.
Corresponding wild-type sequences can also be cloned into the same vectors.

These vectors can then be transfected into several different cell types,
including a variety of mammalian cell lines (293, RD, COS-7, and CHO cell lines,
available, for example, from the A.T.C.C.). The cell lines are then cultured under
appropriate conditions and the levels of p24 (Gag) or gp160 or gp120 (Env)
expression in supernatants can be evaluated (Example 2). Env polypeptides include,
but are not limited to, for example, native gp160, oligomeric gp140, monomeric
gp120 as well as modified sequences of these polypeptides. The results of these
assays demonstrate that expression of synthetic Env, Gag and Gag-protease encoding
sequences is significantly higher than that of corresponding wild-type sequences.

Further, Western Blot analysis can be used to show that cells containing the
synthetic Gag or Env expression cassette produce the expected protein at higher
per-cell concentrations than cells containing the native expression cassette. The Gag
and Env proteins can be seen in both cell lysates and supernatants. The levels of
production are significantly higher in cell supernatants for cells transfected with the
synthetic expression cassettes of the present invention.

Fractionation of the supernatants from mammalian cells transfected with the
synthetic Gag or Env expression cassette can be used to show that the cassettes
provide superior production of both Gag and Env proteins and, in the case of Gag,
VLPs, relative to the wild-type sequences.

Efficient expression of these Gag- and/or Env-containing polypeptides in
mammalian cell lines provides the following benefits: the polypeptides are free of
baculovirus contaminants; production by established methods approved by the FDA;
increased purity; greater yields (relative to native coding sequences); and a novel
method of producing the Gag- and/or Env-containing polypeptides in CHO cells
which is not feasible in the absence of the increased expression obtained using the
constructs of the present invention. Exemplary mammalian cell lines include, but are
not limited to, BHK, VERO, HT1080, 293, 293T, RD, COS-7, CHO, Jurkat, HUT,
SUPT, C8166, MOLT4/clone8, MT-2, MT-4, H9, PM1, CEM, and CEMX174 (such
cell lines are available, for example, from the A.T.C.C.).

A synthetic Gag expression cassette of the present invention will also exhibit
high levels of expression and VLP production when transfected into insect cells.
Synthetic Env expression cassettes also demonstrate high levels of expression in
insect cells. Further, in addition to a higher total protein yield, the final product from
the synthetic polypeptides consistently contains lower amounts of contaminating
baculovirus proteins than the final product from the native Gag or Env.

Further, synthetic Gag and Env expression cassettes of the present invention
can also be introduced into yeast vectors which, in turn, can be transformed into and
efficiently expressed by yeast cells (Saccharomyces cerevisiae; using vectors as
described in Rosenberg, S. and Tekamp-Olson, P., U.S. Patent No. RE35,749, issued
March 17, 1998).

In addition to the mammalian and insect vectors, the synthetic expression 
cassettes of the present invention can be incorporated into a variety of expression 
vectors using selected expression control elements. Appropriate vectors and control
elements for any given cell type can be selected by one having ordinary skill in the art
in view of the teachings of the present specification and information known in the art 
about expression vectors. 

For example, a synthetic Gag or Env expression cassette can be inserted into a
vector which includes control elements operably linked to the desired coding
sequence, which allow for the expression of the gene in a selected cell-type. For
example, typical promoters for mammalian cell expression include the SV40 early
promoter, a CMV promoter such as the CMV immediate early promoter (a CMV
promoter can include intron A), RSV, HIV-LTR, the mouse mammary tumor virus
LTR promoter (MMLV-ltr), the adenovirus major late promoter (Ad MLP), and the
herpes simplex virus promoter, among others. Other nonviral promoters, such as a
promoter derived from the murine metallothionein gene, will also find use for
mammalian expression. Typically, transcription termination and polyadenylation
sequences will also be present, located 3' to the translation stop codon. Preferably, a
sequence for optimization of initiation of translation, located 5' to the coding
sequence, is also present. Examples of transcription terminator/polyadenylation
signals include those derived from SV40, as described in Sambrook, et al., supra, as
well as a bovine growth hormone terminator sequence. Introns, containing splice
donor and acceptor sites, may also be designed into the constructs for use with the
present invention (Chapman et al., Nuc. Acids Res. (1991) 19:3979-3986).

Enhancer elements may also be used herein to increase expression levels of
the mammalian constructs. Examples include the SV40 early gene enhancer, as
described in Dijkema et al., EMBO J. (1985) 4:761, the enhancer/promoter derived
from the long terminal repeat (LTR) of the Rous Sarcoma Virus, as described in
Gorman et al., Proc. Natl. Acad. Sci. USA (1982b) 79:6777 and elements derived
from human CMV, as described in Boshart et al., Cell (1985) 41:521, such as
elements included in the CMV intron A sequence (Chapman et al., Nuc. Acids Res.
(1991) 19:3979-3986).

The desired synthetic Gag or Env polypeptide encoding sequences can be
cloned into any number of commercially available vectors to generate expression of
the polypeptide in an appropriate host system. These systems include, but are not
limited to, the following: baculovirus expression {Reilly, P.R., et al., Baculovirus
Expression Vectors: A Laboratory Manual (1992); Beames, et al.,
Biotechniques 11:378 (1991); Pharmingen; Clontech, Palo Alto, CA}, vaccinia
expression {Earl, P.L., et al., "Expression of proteins in mammalian cells using
vaccinia" In Current Protocols in Molecular Biology (F.M. Ausubel, et al. Eds.),
Greene Publishing Associates & Wiley Interscience, New York (1991); Moss, B., et
al., U.S. Patent Number 5,135,855, issued 4 August 1992}, expression in bacteria
{Ausubel, F.M., et al., Current Protocols in Molecular Biology, John Wiley
and Sons, Inc., Media PA; Clontech}, expression in yeast {Rosenberg, S. and
Tekamp-Olson, P., U.S. Patent No. RE35,749, issued March 17, 1998; Shuster, J.R.,
U.S. Patent No. 5,629,203, issued May 13, 1997; Gellissen, G., et al., Antonie Van
Leeuwenhoek, 62(1-2):79-93 (1992); Romanos, M.A., et al., Yeast 8(6):423-488
(1992); Goeddel, D.V., Methods in Enzymology 185 (1990); Guthrie, C., and G.R.
Fink, Methods in Enzymology 194 (1991)}, expression in mammalian cells {Clontech;
Gibco-BRL, Grand Island, NY; e.g., Chinese hamster ovary (CHO) cell lines
(Haynes, J., et al., Nuc. Acid. Res. 11:687-706 (1983); Lau, Y.F., et al., Mol.
Cell. Biol. 4:1469-1475 (1984); Kaufman, R.J., "Selection and coamplification of
heterologous genes in mammalian cells," in Methods in Enzymology, vol. 185,
pp. 537-566, Academic Press, Inc., San Diego CA (1991))}, and expression in plant
cells {plant cloning vectors, Clontech Laboratories, Inc., Palo Alto, CA, and Pharmacia
LKB Biotechnology, Inc., Piscataway, NJ; Hood, E., et al., J. Bacteriol. 168:1291-
1301 (1986); Nagel, R., et al., FEMS Microbiol. Lett. 67:325 (1990); An, et al.,
"Binary Vectors", and others in Plant Molecular Biology Manual A3:1-19 (1988);
Miki, B.L.A., et al., pp. 249-265, and others in Plant DNA Infectious Agents (Hohn,
T., et al., eds.) Springer-Verlag, Wien, Austria, (1987); Plant Molecular Biology:
Essential Techniques, P.G. Jones and J.M. Sutton, New York, J. Wiley, 1997;
Miglani, Gurbachan, Dictionary of Plant Genetics and Molecular Biology, New York,
Food Products Press, 1998; Henry, R.J., Practical Applications of Plant Molecular
Biology, New York, Chapman & Hall, 1997}.

Also included in the invention is an expression vector, containing coding
sequences and expression control elements which allow expression of the coding
regions in a suitable host. The control elements generally include a promoter,
translation initiation codon, and translation and transcription termination sequences,
and an insertion site for introducing the insert into the vector. Translational control
elements have been reviewed by M. Kozak (e.g., Kozak, M., Mamm. Genome
7(8):563-574, 1996; Kozak, M., Biochimie 76(9):815-821, 1994; Kozak, M., J Cell
Biol 108(2):229-241, 1989; Kozak, M., and Shatkin, A.J., Methods Enzymol
60:360-375, 1979).

Expression in yeast systems has the advantage of commercial production.
Recombinant protein production by vaccinia and CHO cell lines has the advantage of
being mammalian expression systems. Further, vaccinia virus expression has several
advantages including the following: (i) its wide host range; (ii) faithful
post-transcriptional modification, processing, folding, transport, secretion, and
assembly of recombinant proteins; (iii) high level expression of relatively soluble
recombinant proteins; and (iv) a large capacity to accommodate foreign DNA.

The recombinantly expressed polypeptides from synthetic Gag- and/or Env-
encoding expression cassettes are typically isolated from lysed cells or culture media.
Purification can be carried out by methods known in the art including salt
fractionation, ion exchange chromatography, gel filtration, size-exclusion
chromatography, size-fractionation, and affinity chromatography. Immunoaffinity
chromatography can be employed using antibodies generated based on, for example,
Gag or Env antigens.

Advantages of expressing the Gag- and/or Env-containing proteins of the
present invention using mammalian cells include, but are not limited to, the
following: well-established protocols for scale-up production; the ability to produce
VLPs; cell lines that are suitable to meet good manufacturing process (GMP)
standards; and culture conditions for mammalian cells that are known in the art.

Various forms of the different embodiments of the invention, described herein, 
may be combined. 

2.2 Production of Virus-like Particles and Use of the
Constructs of the Present Invention to Create Packaging
Cell Lines

The group-specific antigens (Gag) of human immunodeficiency virus type-1
(HIV-1) self-assemble into noninfectious virus-like particles (VLP) that are released
from various eucaryotic cells by budding (reviewed by Freed, E.O., Virology 251:1-
15, 1998). The synthetic expression cassettes of the present invention provide
efficient means for the production of HIV-Gag virus-like particles (VLPs) using a
variety of different cell types, including, but not limited to, mammalian cells.

Viral particles can be used as a matrix for the proper presentation of an antigen
entrapped or associated therewith to the immune system of the host.

2.2.1 VLP Production Using the Synthetic Expression Cassettes of
the Present Invention

Experiments can be performed in support of the present invention to
demonstrate that the synthetic expression cassettes of the present invention provide
superior production of both Gag proteins and VLPs, relative to native Gag coding
sequences. Further, electron microscopic evaluation of VLP production can show that
free and budding immature virus particles of the expected size are produced by cells
containing the synthetic expression cassettes.

Using the synthetic expression cassettes of the present invention, rather than
native Gag coding sequences, for the production of virus-like particles provides several
advantages. First, VLPs can be produced in enhanced quantity, making isolation and
purification of the VLPs easier. Second, VLPs can be produced in a variety of cell
types using the synthetic expression cassettes; in particular, mammalian cell lines can
be used for VLP production, for example, CHO cells. Production using CHO cells
provides (i) VLP formation; (ii) correct myristylation and budding; (iii) absence of
non-mammalian cell contaminants (e.g., insect viruses and/or cells); and (iv) ease of
purification. The synthetic expression cassettes of the present invention are also useful
for enhanced expression in cell-types other than mammalian cell lines. For example,
infection of insect cells with baculovirus vectors encoding the synthetic expression
cassettes results in higher levels of total Gag protein yield and higher levels of VLP
production (relative to wild-type coding sequences). Further, the final product from
insect cells infected with the baculovirus-Gag synthetic expression cassettes
consistently contains lower amounts of contaminating insect proteins than the final
product when wild-type coding sequences are used.

VLPs can spontaneously form when the particle-forming polypeptide of
interest is recombinantly expressed in an appropriate host cell. Thus, the VLPs
produced using the synthetic expression cassettes of the present invention are
conveniently prepared using recombinant techniques. As discussed below, the Gag
polypeptide encoding synthetic expression cassettes of the present invention can
include other polypeptide coding sequences of interest (for example, HIV protease,
HIV polymerase, HCV core; Env; synthetic Env; see, Example 1). Expression of
such synthetic expression cassettes yields VLPs comprising the Gag polypeptide, as
well as the polypeptide of interest.

Once coding sequences for the desired particle-forming polypeptides have
been isolated or synthesized, they can be cloned into any suitable vector or replicon
for expression. Numerous cloning vectors are known to those of skill in the art, and
the selection of an appropriate cloning vector is a matter of choice. See, generally,
Sambrook et al., supra. The vector is then used to transform an appropriate host cell.
Suitable recombinant expression systems include, but are not limited to, bacterial,
mammalian, baculovirus/insect, vaccinia, Semliki Forest virus (SFV), Alphaviruses
(such as Sindbis and Venezuelan Equine Encephalitis (VEE)), yeast and
Xenopus expression systems, well known in the art. Particularly preferred expression
systems are mammalian cell lines, vaccinia, Sindbis, insect and yeast systems.

For example, a number of mammalian cell lines are known in the art and
include immortalized cell lines available from the American Type Culture Collection
(A.T.C.C.), such as, but not limited to, Chinese hamster ovary (CHO) cells, HeLa
cells, baby hamster kidney (BHK) cells, monkey kidney cells (COS), as well as
others. Similarly, bacterial hosts such as E. coli, Bacillus subtilis, and Streptococcus
spp., will find use with the present expression constructs. Yeast hosts useful in the
present invention include inter alia, Saccharomyces cerevisiae, Candida albicans,
Candida maltosa, Hansenula polymorpha, Kluyveromyces fragilis, Kluyveromyces
lactis, Pichia guillerimondii, Pichia pastoris, Schizosaccharomyces pombe and
Yarrowia lipolytica. Insect cells for use with baculovirus expression vectors include,
inter alia, Aedes aegypti, Autographa californica, Bombyx mori, Drosophila
melanogaster, Spodoptera frugiperda, and Trichoplusia ni. See, e.g., Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987).

Viral vectors can be used for the production of particles in eucaryotic cells,
such as those derived from the pox family of viruses, including vaccinia virus and
avian poxvirus. Additionally, a vaccinia based infection/transfection system, as
described in Tomei et al., J. Virol. (1993) 67:4017-4026 and Selby et al., J. Gen.
Virol. (1993) 74:1103-1113, will also find use with the present invention. In this
system, cells are first infected in vitro with a vaccinia virus recombinant that encodes
the bacteriophage T7 RNA polymerase. This polymerase displays exquisite
specificity in that it only transcribes templates bearing T7 promoters. Following
infection, cells are transfected with the DNA of interest, driven by a T7 promoter.
The polymerase expressed in the cytoplasm from the vaccinia virus recombinant
transcribes the transfected DNA into RNA which is then translated into protein by the
host translational machinery. Alternately, T7 can be added as a purified protein or
enzyme as in the "Progenitor" system (Studier and Moffatt, J. Mol. Biol. (1986)
189:113-130). The method provides for high level, transient, cytoplasmic production
of large quantities of RNA and its translation product(s).

Depending on the expression system and host selected, the VLPs are produced
by growing host cells transformed by an expression vector under conditions whereby
the particle-forming polypeptide is expressed and VLPs can be formed. The selection
of the appropriate growth conditions is within the skill of the art. If the VLPs are
formed intracellularly, the cells are then disrupted, using chemical, physical or
mechanical means, which lyse the cells yet keep the VLPs substantially intact. Such
methods are known to those of skill in the art and are described in, e.g., Protein
Purification Applications: A Practical Approach, (E.L.V. Harris and S. Angal, Eds.,
1990).

The particles are then isolated (or substantially purified) using methods that
preserve the integrity thereof, such as, by gradient centrifugation, e.g., cesium
chloride (CsCl), sucrose gradients, pelleting and the like (see, e.g., Kirnbauer et al. J.
Virol. (1993) 67:6929-6936), as well as standard purification techniques including,
e.g., ion exchange and gel filtration chromatography.

VLPs produced by cells containing the synthetic expression cassettes of the
present invention can be used to elicit an immune response when administered to a
subject. One advantage of the present invention is that VLPs can be produced by
mammalian cells carrying the synthetic expression cassettes at levels previously not
possible. As discussed above, the VLPs can comprise a variety of antigens in
addition to the Gag polypeptide (e.g., Gag-protease, Gag-polymerase, Env, synthetic
Env, etc.). Purified VLPs, produced using the synthetic expression cassettes of the
present invention, can be administered to a vertebrate subject, usually in the form of
vaccine compositions. Combination vaccines may also be used, where such vaccines
contain, for example, an adjuvant subunit protein (e.g., Env). Administration can take
place using the VLPs formulated alone or formulated with other antigens. Further,
the VLPs can be administered prior to, concurrent with, or subsequent to, delivery of
the synthetic expression cassettes for DNA immunization (see below) and/or delivery
of other vaccines. Also, the site of VLP administration may be the same as or different
from that of other vaccine compositions that are being administered. Gene delivery can
be accomplished by a number of methods including, but not limited to, immunization
with DNA, alphavirus vectors, pox virus vectors, and vaccinia virus vectors.

VLP immune-stimulating (or vaccine) compositions can include various
excipients, adjuvants, carriers, auxiliary substances, modulating agents, and the like.
The immune stimulating compositions will include an amount of the VLP/antigen
sufficient to mount an immunological response. An appropriate effective amount can
be determined by one of skill in the art. Such an amount will fall in a relatively broad
range that can be determined through routine trials and will generally be an amount on
the order of about 0.1 ng to about 1000 µg, more preferably about 1 ng to about 300
µg, of VLP/antigen.

A carrier is optionally present which is a molecule that does not itself induce
the production of antibodies harmful to the individual receiving the composition.
Suitable carriers are typically large, slowly metabolized macromolecules such as
proteins, polysaccharides, polylactic acids, polyglycollic acids, polymeric amino
acids, amino acid copolymers, lipid aggregates (such as oil droplets or liposomes),
and inactive virus particles. Examples of particulate carriers include those derived
from polymethyl methacrylate polymers, as well as microparticles derived from
poly(lactides) and poly(lactide-co-glycolides), known as PLG. See, e.g., Jeffery et al.,
Pharm. Res. (1993) 10:362-368; McGee JP, et al., J Microencapsul. 14(2):197-210,
1997; O'Hagan DT, et al., Vaccine 11(2):149-54, 1993. Such carriers are well known
to those of ordinary skill in the art. Additionally, these carriers may function as
immunostimulating agents ("adjuvants"). Furthermore, the antigen may be
conjugated to a bacterial toxoid, such as toxoid from diphtheria, tetanus, cholera, etc.,
as well as toxins derived from E. coli.

Adjuvants may also be used to enhance the effectiveness of the compositions.
Such adjuvants include, but are not limited to: (1) aluminum salts (alum), such as
aluminum hydroxide, aluminum phosphate, aluminum sulfate, etc.; (2) oil-in-water
emulsion formulations (with or without other specific immunostimulating agents such
as muramyl peptides (see below) or bacterial cell wall components), such as for
example (a) MF59 (International Publication No. WO 90/14837), containing 5%
Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing various amounts
of MTP-PE (see below), although not required) formulated into submicron particles
using a microfluidizer such as Model 110Y microfluidizer (Microfluidics, Newton,
MA), (b) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-blocked
polymer L121, and thr-MDP (see below) either microfluidized into a submicron
emulsion or vortexed to generate a larger particle size emulsion, and (c) Ribi™
adjuvant system (RAS), (Ribi Immunochem, Hamilton, MT) containing 2% Squalene,
0.2% Tween 80, and one or more bacterial cell wall components from the group
consisting of monophosphoryl lipid A (MPL), trehalose dimycolate (TDM), and cell
wall skeleton (CWS), preferably MPL + CWS (Detox™); (3) saponin adjuvants, such
as Stimulon™ (Cambridge Bioscience, Worcester, MA) may be used or particles
generated therefrom such as ISCOMs (immunostimulating complexes); (4) Complete
Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, such
as interleukins (IL-1, IL-2, etc.), macrophage colony stimulating factor (M-CSF),
tumor necrosis factor (TNF), etc.; (6) oligonucleotides or polymeric molecules
encoding immunostimulatory CpG motifs (Davis, H.L., et al., J. Immunology
160:870-876, 1998; Sato, Y. et al., Science 273:352-354, 1996) or complexes of
antigens/oligonucleotides {Polymeric molecules include double and single stranded
RNA and DNA, and backbone modifications thereof, for example,
methylphosphonate linkages; or (7) detoxified mutants of a bacterial ADP-
ribosylating toxin such as a cholera toxin (CT), a pertussis toxin (PT), or an E. coli
heat-labile toxin (LT), particularly LT-K63 (where lysine is substituted for the wild-
type amino acid at position 63), LT-R72 (where arginine is substituted for the wild-
type amino acid at position 72), CT-S109 (where serine is substituted for the wild-
type amino acid at position 109), and PT-K9/G129 (where lysine is substituted for the
wild-type amino acid at position 9 and glycine substituted at position 129) (see, e.g.,
International Publication Nos. WO 93/13202 and WO 92/19265); and (8) other
substances that act as immunostimulating agents to enhance the effectiveness of the
composition. Further, such polymeric molecules include alternative polymer
backbone structures such as, but not limited to, polyvinyl backbones (Pitha, Biochem
Biophys Acta, 204:39, 1970a; Pitha, Biopolymers, 9:965, 1970b), and morpholino
backbones (Summerton, J., et al., U.S. Patent No. 5,142,047, issued 08/25/92;
Summerton, J., et al., U.S. Patent No. 5,185,444 issued 02/09/93). A variety of other
charged and uncharged polynucleotide analogs have been reported. Numerous
backbone modifications are known in the art, including, but not limited to, uncharged
linkages (e.g., methyl phosphonates, phosphotriesters, phosphoamidates, and
carbamates) and charged linkages (e.g., phosphorothioates and
phosphorodithioates).}; and (7) other substances that act as immunostimulating agents
to enhance the effectiveness of the VLP immune-stimulating (or vaccine)
composition. Alum, CpG oligonucleotides, and MF59 are preferred.

Muramyl peptides include, but are not limited to, N-acetyl-muramyl-L-
threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine
(nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-
dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE), etc.

Dosage treatment with the VLP composition may be a single dose schedule or
a multiple dose schedule. A multiple dose schedule is one in which a primary course
of vaccination may be with 1-10 separate doses, followed by other doses given at
subsequent time intervals, chosen to maintain and/or reinforce the immune response,
for example at 1-4 months for a second dose, and if needed, a subsequent dose(s) after
several months. The dosage regimen will also, at least in part, be determined by the
need of the subject and be dependent on the judgment of the practitioner.

If prevention of disease is desired, the antigen-carrying VLPs are generally
administered prior to primary infection with the pathogen of interest. If treatment is
desired, e.g., the reduction of symptoms or recurrences, the VLP compositions are
generally administered subsequent to primary infection.

2.2.2 Using the Synthetic Expression Cassettes of the Present
Invention to Create Packaging Cell Lines

A number of viral based systems have been developed for use as gene transfer
vectors for mammalian host cells. For example, retroviruses (in particular, lentiviral
vectors) provide a convenient platform for gene delivery systems. A coding sequence
of interest (for example, a sequence useful for gene therapy applications) can be
inserted into a gene delivery vector and packaged in retroviral particles using
techniques known in the art. Recombinant virus can then be isolated and delivered to
cells of the subject either in vivo or ex vivo. A number of retroviral systems have been
described, including, for example, the following: (U.S. Patent No. 5,219,740; Miller
et al. (1989) BioTechniques 7:980; Miller, A.D. (1990) Human Gene Therapy 1:5;
Scarpa et al. (1991) Virology 180:849; Burns et al. (1993) Proc. Natl. Acad. Sci. USA
90:8033; Boris-Lawrie et al. (1993) Curr. Opin. Genet. Develop. 3:102; GB 2200651;
EP 0415731; EP 0345242; WO 89/02468; WO 89/05349; WO 89/09271; WO
90/02806; WO 90/07936; WO 94/03622; WO 93/25698; WO
93/25234; WO 93/11230; WO 93/10218; WO 91/02805; in U.S. 5,219,740; U.S.
4,405,712; U.S. 4,861,719; U.S. 4,980,289 and U.S. 4,777,127; in U.S. Serial No.
07/800,921; and in Vile (1993) Cancer Res 53:3860-3864; Vile (1993) Cancer Res
53:962-967; Ram (1993) Cancer Res 53:83-88; Takamiya (1992) J Neurosci Res
22:493-503; Baba (1993) J Neurosurg 29:729-735; Mann (1983) Cell 23:153; Cone
(1984) Proc Natl Acad Sci USA 81:6349; and Miller (1990) Human Gene Therapy 1).

In other embodiments, gene transfer vectors can be constructed to encode a
cytokine or other immunomodulatory molecule. For example, nucleic acid sequences
encoding native IL-2 and gamma-interferon can be obtained as described in US Patent
Nos. 4,738,927 and 5,326,859, respectively, while useful muteins of these proteins
can be obtained as described in U.S. Patent No. 4,853,332. Nucleic acid sequences
encoding the short and long forms of mCSF can be obtained as described in US Patent
Nos. 4,847,201 and 4,879,227, respectively. In particular aspects of the invention,
retroviral vectors expressing cytokine or immunomodulatory genes can be produced
as described herein (for example, employing the packaging cell lines of the present
invention) and in International Application No. PCT US 94/02951, entitled
"Compositions and Methods for Cancer Immunotherapy."

Examples of suitable immunomodulatory molecules for use herein include the
following: IL-1 and IL-2 (Karupiah et al. (1990) J. Immunology 144:290-298, Weber
et al. (1987) J. Exp. Med. 166:1716-1733, Gansbacher et al. (1990) J. Exp. Med.
172:1217-1224, and U.S. Patent No. 4,738,927); IL-3 and IL-4 (Tepper et al. (1989)
Cell 57:503-512, Golumbek et al. (1991) Science 254:713-716, and U.S. Patent No.
5,017,691); IL-5 and IL-6 (Brakenhof et al. (1987) J. Immunol. 139:4116-4121, and
International Publication No. WO 90/06370); IL-7 (U.S. Patent No. 4,965,195); IL-8,
IL-9, IL-10, IL-11, IL-12, and IL-13 (Cytokine Bulletin, Summer 1994); IL-14 and
IL-15; alpha interferon (Pinter et al. (1991) Drugs 42:749-765, U.S. Patent Nos.
4,892,743 and 4,966,843, International Publication No. WO 85/02862, Nagata et al.
(1980) Nature 284:316-320, Familletti et al. (1981) Methods in Enz. 78:387-394, Twu
et al. (1989) Proc. Natl. Acad. Sci. USA 86:2046-2050, and Faktor et al. (1990)
Oncogene 5:867-872); beta-interferon (Seif et al. (1991) J. Virol. 65:664-671);
gamma-interferons (Radford et al. (1991) The American Society of Hepatology
2008-2015, Watanabe et al. (1989) Proc. Natl. Acad. Sci. USA 86:9456-9460,
Gansbacher et al. (1990) Cancer Research 50:7820-7825, Maio et al. (1989) Can.
Immunol. Immunother. 30:34-42, and U.S. Patent Nos. 4,762,791 and 4,727,138);
G-CSF (U.S. Patent Nos. 4,999,291 and 4,810,643); GM-CSF (International Publication
No. WO 85/04188).

Immunomodulatory factors may also be agonists, antagonists, or ligands for
these molecules. For example, soluble forms of receptors can often behave as
antagonists for these types of factors, as can mutated forms of the factors themselves.

Nucleic acid molecules that encode the above-described substances, as well as
other nucleic acid molecules that are advantageous for use within the present
invention, may be readily obtained from a variety of sources, including, for example,
depositories such as the American Type Culture Collection, or from commercial
sources such as British Bio-Technology Limited (Cowley, Oxford, England).
Representative examples include BBG 12 (containing the GM-CSF gene coding for
the mature protein of 127 amino acids), BBG 6 (which contains sequences encoding
gamma interferon), A.T.C.C. Deposit No. 39656 (which contains sequences encoding
TNF), A.T.C.C. Deposit No. 20663 (which contains sequences encoding
alpha-interferon), A.T.C.C. Deposit Nos. 31902, 31902 and 39517 (which contain
sequences encoding beta-interferon), A.T.C.C. Deposit No. 67024 (which contains a
sequence which encodes Interleukin-1b), A.T.C.C. Deposit Nos. 39405, 39452,
39516, 39626 and 39673 (which contain sequences encoding Interleukin-2), A.T.C.C.
Deposit Nos. 59399, 59398, and 67326 (which contain sequences encoding
Interleukin-3), A.T.C.C. Deposit No. 57592 (which contains sequences encoding
Interleukin-4), A.T.C.C. Deposit Nos. 59394 and 59395 (which contain sequences
encoding Interleukin-5), and A.T.C.C. Deposit No. 67153 (which contains sequences
encoding Interleukin-6).

Plasmids containing cytokine genes or immunomodulatory genes
(International Publication Nos. WO 94/02951 and WO) can be digested with
appropriate restriction enzymes, and DNA fragments containing the particular gene of
interest can be inserted into a gene transfer vector using standard molecular biology
techniques. (See, e.g., Sambrook et al., supra, or Ausubel et al. (eds) Current
Protocols in Molecular Biology, Greene Publishing and Wiley-Interscience).

Polynucleotide sequences coding for the above-described molecules can be
obtained using recombinant methods, such as by screening cDNA and genomic
libraries from cells expressing the gene, or by deriving the gene from a vector known
to include the same. For example, plasmids which contain sequences that encode
altered cellular products may be obtained from a depository such as the A.T.C.C., or
from commercial sources. Plasmids containing the nucleotide sequences of interest
can be digested with appropriate restriction enzymes, and DNA fragments containing
the nucleotide sequences can be inserted into a gene transfer vector using standard
molecular biology techniques.

Alternatively, cDNA sequences for use with the present invention may be
obtained from cells which express or contain the sequences, using standard
techniques, such as phenol extraction and PCR of cDNA or genomic DNA. See, e.g.,
Sambrook et al., supra, for a description of techniques used to obtain and isolate
DNA. Briefly, mRNA from a cell which expresses the gene of interest can be reverse
transcribed with reverse transcriptase using oligo-dT or random primers. The single
stranded cDNA may then be amplified by PCR (see U.S. Patent Nos. 4,683,202,
4,683,195 and 4,800,159; see also PCR Technology: Principles and Applications for
DNA Amplification, Erlich (ed.), Stockton Press, 1989) using oligonucleotide primers
complementary to sequences on either side of desired sequences.
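
The choice of primers flanking a desired sequence can be illustrated with a short
sketch. It is not part of the disclosed methods; the function names, the fixed primer
length and the toy template are assumptions made for illustration, and practical
primer design would also weigh melting temperature and secondary structure.

```python
# Illustrative sketch only: choose primers on either side of a target region.
# The names and the fixed primer length are assumptions, not from the text.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def flanking_primers(template: str, start: int, end: int, length: int = 20):
    """Return (forward, reverse) primers flanking template[start:end].

    The forward primer copies the sense strand immediately upstream of the
    target; the reverse primer is the reverse complement of the sense strand
    immediately downstream, so both primers point toward the target.
    """
    template = template.upper()
    if start - length < 0 or end + length > len(template):
        raise ValueError("not enough flanking sequence for this primer length")
    forward = template[start - length:start]
    reverse = reverse_complement(template[end:end + length])
    return forward, reverse


if __name__ == "__main__":
    toy = "ACGTACGTAA" + "TTTGGGCCC" + "GATTACAGAT"  # upstream + target + downstream
    print(flanking_primers(toy, 10, 19, length=10))
```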

The nucleotide sequence of interest can also be produced synthetically, rather
than cloned, using a DNA synthesizer (e.g., an Applied Biosystems Model 392 DNA
Synthesizer, available from ABI, Foster City, California). The nucleotide sequence
can be designed with the appropriate codons for the expression product desired. The
complete sequence is assembled from overlapping oligonucleotides prepared by
standard methods and assembled into a complete coding sequence. See, e.g., Edge
(1981) Nature 292:756; Nambiar et al. (1984) Science 223:1299; Jay et al. (1984) J.
Biol. Chem. 259:6311.
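
As a rough illustration of assembly from overlapping oligonucleotides, the sketch
below splits a designed sequence into overlapping pieces and joins them again. It is
a simplification under stated assumptions: all pieces are written on one strand
(real designs usually alternate strands), and the oligonucleotide length and overlap
are arbitrary example values rather than values taken from the specification.

```python
# Illustrative sketch only: split a designed sequence into overlapping oligos
# and reassemble it. Lengths and single-strand layout are assumptions.

def split_into_overlapping_oligos(seq: str, oligo_len: int = 60, overlap: int = 20):
    """Cut seq into pieces of oligo_len that share `overlap` bases with the next piece."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos, pos = [], 0
    while True:
        oligos.append(seq[pos:pos + oligo_len])
        if pos + oligo_len >= len(seq):
            break
        pos += step
    return oligos


def assemble(oligos, overlap: int = 20) -> str:
    """Join oligos in order, checking that each overlap matches the growing sequence."""
    assembled = oligos[0]
    for oligo in oligos[1:]:
        if not assembled.endswith(oligo[:overlap]):
            raise ValueError("adjacent oligonucleotides do not overlap as expected")
        assembled += oligo[overlap:]
    return assembled


if __name__ == "__main__":
    designed = ("ACGTTGCA" * 19)[:150]
    pieces = split_into_overlapping_oligos(designed)
    print(len(pieces), assemble(pieces) == designed)
```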

The synthetic expression cassettes of the present invention can be employed in 
the construction of packaging cell lines for use with retroviral vectors. 

One type of retrovirus, the murine leukemia virus, or "MLV", has been widely
utilized for gene therapy applications (see generally Mann et al. (Cell 33:153, 1983),
Cane and Mulligan (Proc. Natl. Acad. Sci. USA 81:6349, 1984), and Miller et al.,
Human Gene Therapy 1:5-14, 1990).

Lentiviral vectors typically comprise a 5' lentiviral LTR, a tRNA binding site,
a packaging signal, a promoter operably linked to one or more genes of interest, an
origin of second strand DNA synthesis and a 3' lentiviral LTR, wherein the lentiviral
vector contains a nuclear transport element. The nuclear transport element may be
located either upstream (5') or downstream (3') of a coding sequence of interest (for
example, a synthetic Gag or Env expression cassette of the present invention). Within
certain embodiments, the nuclear transport element is not RRE. Within one
embodiment the packaging signal is an extended packaging signal. Within other
embodiments the promoter is a tissue specific promoter, or, alternatively, a promoter
such as CMV. Within other embodiments, the lentiviral vector further comprises an
internal ribosome entry site.
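
The element order described for a lentiviral vector can be written down as a simple
checklist. The sketch below is only a bookkeeping aid under stated assumptions: the
element labels, the "NTE:" prefix used to mark a nuclear transport element, and the
example label "CTE" are hypothetical conventions invented for this illustration.

```python
# Illustrative sketch only: check a 5'->3' list of element labels against the
# layout described in the text. Labels and the "NTE:" prefix are assumptions.

REQUIRED_ORDER = (
    "5' LTR",
    "tRNA binding site",
    "packaging signal",
    "promoter",
    "gene of interest",
    "origin of second strand DNA synthesis",
    "3' LTR",
)


def check_vector_layout(elements, allow_rre: bool = False):
    """Return a list of problems; an empty list means the layout passes."""
    problems = []
    remaining = iter(elements)
    for name in REQUIRED_ORDER:
        # `name in remaining` advances the iterator, so order is enforced.
        if name not in remaining:
            problems.append(f"missing or out of order: {name}")
            break
    transport = [e for e in elements if e.startswith("NTE:")]
    if not transport:
        problems.append("no nuclear transport element")
    elif "NTE:RRE" in transport and not allow_rre:
        problems.append("nuclear transport element is RRE")
    return problems


if __name__ == "__main__":
    layout = ["5' LTR", "tRNA binding site", "packaging signal", "promoter",
              "gene of interest", "NTE:CTE",
              "origin of second strand DNA synthesis", "3' LTR"]
    print(check_vector_layout(layout))
```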

A wide variety of lentiviruses may be utilized within the context of the present
invention, including for example, lentiviruses selected from the group consisting of
HIV, HIV-1, HIV-2, FIV and SIV.

In one embodiment of the present invention, synthetic Gag-polymerase
expression cassettes are provided comprising a promoter and a sequence encoding
synthetic Gag-polymerase and at least one of vpr, vpu, nef or vif, wherein the
promoter is operably linked to Gag-polymerase and vpr, vpu, nef or vif.

Within yet another aspect of the invention, host cells (e.g., packaging cell
lines) are provided which contain any of the expression cassettes described herein.
For example, within one aspect packaging cell lines are provided comprising an
expression cassette that comprises a sequence encoding synthetic Gag-polymerase,
and a nuclear transport element, wherein the promoter is operably linked to the
sequence encoding Gag-polymerase. Packaging cell lines may further comprise a
promoter and a sequence encoding tat, rev, or an envelope, wherein the promoter is
operably linked to the sequence encoding tat, rev, Env or modified Env proteins. The
packaging cell line may further comprise a sequence encoding any one or more of nef,
vif, vpu or vpr.

In one embodiment, the expression cassette (carrying, for example, the
synthetic Gag-polymerase) is stably integrated. The packaging cell line, upon
introduction of a lentiviral vector, typically produces particles. The promoter
regulating expression of the synthetic expression cassette may be inducible.
Typically, the packaging cell line, upon introduction of a lentiviral vector, produces
particles that are essentially free of replication competent virus.

Packaging cell lines are provided comprising an expression cassette which
directs the expression of a synthetic Gag-polymerase gene or comprising an
expression cassette which directs the expression of the synthetic Env genes described
herein. (See, also, Andre, S., et al., Journal of Virology 72(2):1497-1503, 1998; Haas,
J., et al., Current Biology 6(3):315-324, 1996, for a description of other modified Env
sequences.) A lentiviral vector is introduced into the packaging cell line to produce a
vector producing cell line.

As noted above, lentiviral vectors can be designed to carry or express a
selected gene(s) or sequences of interest. Lentiviral vectors may be readily
constructed from a wide variety of lentiviruses (see RNA Tumor Viruses, Second
Edition, Cold Spring Harbor Laboratory, 1985). Representative examples of
lentiviruses include HIV, HIV-1, HIV-2, FIV and SIV. Such lentiviruses may either
be obtained from patient isolates, or, more preferably, from depositories or collections
such as the American Type Culture Collection, or isolated from known sources using
available techniques.

Portions of the lentiviral gene delivery vectors (or vehicles) may be derived
from different viruses. For example, in a given recombinant lentiviral vector, LTRs
may be derived from an HIV, a packaging signal from SIV, and an origin of second
strand synthesis from HIV-2. Lentiviral vector constructs may comprise a 5' lentiviral
LTR, a tRNA binding site, a packaging signal, one or more heterologous sequences,
an origin of second strand DNA synthesis and a 3' LTR, wherein said lentiviral vector
contains a nuclear transport element that is not RRE.

Briefly, Long Terminal Repeats ("LTRs") are subdivided into three elements,
designated U5, R and U3. These elements contain a variety of signals which are
responsible for the biological activity of a retrovirus, including for example, promoter
and enhancer elements which are located within U3. LTRs may be readily identified
in the provirus (integrated DNA form) due to their precise duplication at either end of
the genome. As utilized herein, a 5' LTR should be understood to include a 5'
promoter element and sufficient LTR sequence to allow reverse transcription and
integration of the DNA form of the vector. The 3' LTR should be understood to
include a polyadenylation signal, and sufficient LTR sequence to allow reverse
transcription and integration of the DNA form of the vector.

The tRNA binding site and origin of second strand DNA synthesis are also
important for a retrovirus to be biologically active, and may be readily identified by
one of skill in the art. For example, retroviral tRNA binds to a tRNA binding site by
Watson-Crick base pairing, and is carried with the retrovirus genome into a viral
particle. The tRNA is then utilized as a primer for DNA synthesis by reverse
transcriptase. The tRNA binding site may be readily identified based upon its
location just downstream from the 5' LTR. Similarly, the origin of second strand DNA
synthesis is, as its name implies, important for the second strand DNA synthesis of a
retrovirus. This region, which is also referred to as the poly-purine tract, is located
just upstream of the 3' LTR.

In addition to a 5' and 3' LTR, tRNA binding site, and origin of second strand
DNA synthesis, recombinant retroviral vector constructs may also comprise a
packaging signal, as well as one or more genes or coding sequences of interest. In
addition, the lentiviral vectors have a nuclear transport element which, in preferred
embodiments, is not RRE. Representative examples of suitable nuclear transport
elements include the element in Rous sarcoma virus (Ogert et al., J. Virol. 70,
3834-3843, 1996), the element in Rous sarcoma virus (Liu & Mertz, Genes & Dev. 9,
1766-1789, 1995) and the element in the genome of simian retrovirus type I
(Zolotukhin et al., J. Virol. 68, 7944-7952, 1994). Other potential elements include
the elements in the histone gene (Kedes, Annu. Rev. Biochem. 48, 837-870, 1970), the
α-interferon gene (Nagata et al., Nature 287, 401-408, 1980), the β-adrenergic
receptor gene (Kobilka et al., Nature 329, 75-79, 1987), and the c-Jun gene (Hattori
et al., Proc. Natl. Acad. Sci. 85, 9148-9152, 1988).

Recombinant lentiviral vector constructs typically lack both Gag-polymerase
and Env coding sequences. Recombinant lentiviral vectors typically contain less than
20, preferably 15, more preferably 10, and most preferably 8 consecutive nucleotides
found in Gag-polymerase and Env genes. One advantage of the present invention is
that the synthetic Gag-polymerase expression cassettes, which can be used to
construct packaging cell lines for the recombinant retroviral vector constructs, have
little homology to wild-type Gag-polymerase sequences and thus considerably reduce
or eliminate the possibility of homologous recombination between the synthetic and
wild-type sequences.
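
The limit on consecutive shared nucleotides can be checked computationally. The
sketch below is one possible way to do so, not a method taken from the
specification: it finds the longest run of identical consecutive nucleotides shared
by two sequences on the same strand (a fuller check would also compare the reverse
complement). The function names and the default limit of 8 are illustrative choices.

```python
# Illustrative sketch only: longest run of consecutive nucleotides shared by
# two sequences (same strand), via the classic longest-common-substring table.

def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest stretch of consecutive bases present in both a and b."""
    a, b = a.upper(), b.upper()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1  # extend the run ending at i-1, j-1
                best = max(best, curr[j])
        prev = curr
    return best


def below_limit(vector_seq: str, gene_seq: str, limit: int = 8) -> bool:
    """True if the sequences share fewer than `limit` consecutive nucleotides."""
    return longest_shared_run(vector_seq, gene_seq) < limit


if __name__ == "__main__":
    print(longest_shared_run("AAACGTACGTTT", "GGCGTACGCC"))
```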
Lentiviral vectors may also include tissue-specific promoters to drive
expression of one or more genes or sequences of interest.

Lentiviral vector constructs may be generated such that more than one gene of
interest is expressed. This may be accomplished through the use of di- or
oligo-cistronic cassettes (e.g., where the coding regions are separated by 80
nucleotides or less, see generally Levin et al., Gene 108:167-174, 1991), or through
the use of Internal Ribosome Entry Sites ("IRES").

Packaging cell lines suitable for use with the above described recombinant
retroviral vector constructs may be readily prepared given the disclosure provided
herein. Briefly, the parent cell line from which the packaging cell line is derived can
be selected from a variety of mammalian cell lines, including for example, 293, RD,
COS-7, CHO, BHK, VERO, HT1080, and myeloma cells.

After selection of a suitable host cell for the generation of a packaging cell 
line, one or more expression cassettes are introduced into the cell line in order to 
complement or supply in trans components of the vector which have been deleted. 

Representative examples of suitable expression cassettes have been described
herein and include synthetic Env, synthetic Gag, synthetic Gag-protease, and synthetic
Gag-polymerase expression cassettes, which comprise a promoter and a sequence
encoding, e.g., Gag-polymerase and at least one of vpr, vpu, nef or vif, wherein the
promoter is operably linked to Gag-polymerase and vpr, vpu, nef or vif. As described
above, the native and/or modified Env coding sequences may also be utilized in these
expression cassettes.

Utilizing the above-described expression cassettes, a wide variety of
packaging cell lines can be generated. For example, within one aspect packaging cell
lines are provided comprising an expression cassette that comprises a sequence
encoding synthetic Gag-polymerase, and a nuclear transport element, wherein the
promoter is operably linked to the sequence encoding Gag-polymerase. Within other
aspects, packaging cell lines are provided comprising a promoter and a sequence
encoding tat, rev, Env, or other HIV antigens or epitopes derived therefrom, wherein
the promoter is operably linked to the sequence encoding tat, rev, Env, or the HIV
antigen or epitope. Within further embodiments, the packaging cell line may
comprise a sequence encoding any one or more of nef, vif, vpu or vpr. For example,
the packaging cell line may contain only nef, vif, vpu, or vpr alone; nef and vif; nef
and vpu; nef and vpr; vif and vpu; vif and vpr; vpu and vpr; nef, vif and vpu; nef, vif
and vpr; nef, vpu and vpr; vif, vpu and vpr; or all four of nef, vif, vpu and vpr.
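
The enumeration above covers every non-empty combination of the four accessory
genes, fifteen in all. A minimal sketch that generates the same list follows; the
function name is an illustrative choice and only the Python standard library is used.

```python
# Illustrative sketch only: list every non-empty combination of the four
# accessory genes named in the text (4 singles + 6 pairs + 4 triples + 1 quad).
from itertools import combinations

ACCESSORY_GENES = ("nef", "vif", "vpu", "vpr")


def accessory_combinations(genes=ACCESSORY_GENES):
    """Return all non-empty subsets of genes, smallest subsets first."""
    return [combo
            for size in range(1, len(genes) + 1)
            for combo in combinations(genes, size)]


if __name__ == "__main__":
    for combo in accessory_combinations():
        print(" + ".join(combo))
```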

In one embodiment, the expression cassette is stably integrated. Within
another embodiment, the packaging cell line, upon introduction of a lentiviral vector,
produces particles. Within further embodiments the promoter is inducible. Within
certain preferred embodiments of the invention, the packaging cell line, upon
introduction of a lentiviral vector, produces particles that are free of replication
competent virus.

The synthetic cassettes containing optimized coding sequences are transfected
into a selected cell line. Transfected cells are selected that (i) carry, typically,
integrated, stable copies of the Gag, Pol, and Env coding sequences, and (ii)
express acceptable levels of these polypeptides (expression can be evaluated by
methods known in the prior art, e.g., see Examples 1-4). The ability of the cell line to
produce VLPs may also be verified.

A sequence of interest is constructed into a suitable viral vector as discussed
above. This defective virus is then transfected into the packaging cell line. The
packaging cell line provides the viral functions necessary for producing virus-like
particles into which the defective viral genome, containing the sequence of interest,
is packaged. These VLPs are then isolated and can be used, for example, in gene
delivery or gene therapy.

Further, such packaging cell lines can also be used to produce VLPs alone,
which can, for example, be used as adjuvants for administration with other antigens or
in vaccine compositions. Also, co-expression of a selected sequence of interest
encoding a polypeptide (for example, an antigen) in the packaging cell line can also
result in the entrapment and/or association of the selected polypeptide in/with the
VLPs.

2.3 DNA Immunization and Gene Delivery

A variety of HIV polypeptide antigens, particularly Type C HIV antigens, can
be used in the practice of the present invention. HIV antigens can be included in
DNA immunization constructs containing, for example, a synthetic Gag expression
cassette fused in-frame to a coding sequence for the polypeptide antigen, where
expression of the construct results in VLPs presenting the antigen of interest.

HIV antigens of particular interest to be used in the practice of the present
invention include tat, rev, nef, vif, vpu, vpr, and other HIV antigens or epitopes
derived therefrom. For example, the packaging cell line may contain only nef, and
HIV-1 (also known as HTLV-III, LAV, ARV, etc.), including, but not limited to,
antigens such as gp120, gp41, gp160 (both native and modified); Gag; and pol from a
variety of isolates including, but not limited to, HIVIIIb, HIVSF2, HIV-1SF162,
HIV-1SF170, HIVLAV, HIVLAI, HIVMN, HIV-1CM235, HIV-1US4, other HIV-1 strains
from diverse subtypes (e.g., subtypes A through G, and O), HIV-2 strains and diverse
subtypes (e.g., HIV-2UC1 and HIV-2UC2). See, e.g., Myers, et al., Los Alamos
Database, Los Alamos National Laboratory, Los Alamos, New Mexico; Myers, et al.,
Human Retroviruses and Aids, 1990, Los Alamos, New Mexico: Los Alamos National
Laboratory.

To evaluate efficacy, DNA immunization using synthetic expression cassettes
of the present invention can be performed, for instance as described in Example 4.
Mice are immunized with both the Gag (and/or Env) synthetic expression cassette and
the Gag (and/or Env) wild type expression cassette. Mouse immunizations with
plasmid-DNAs will show that the synthetic expression cassettes provide a clear
improvement of immunogenicity relative to the native expression cassettes. Also, the
second boost immunization will induce a secondary immune response, for example,
after approximately two weeks. Further, the results of CTL assays will show
increased potency of synthetic Gag (and/or Env) expression cassettes for induction of
cytotoxic T-lymphocyte (CTL) responses by DNA immunization.

It is readily apparent that the subject invention can be used to mount an
immune response to a wide variety of antigens and hence to treat or prevent an HIV
infection, particularly Type C HIV infection.

2.3.1 Delivery of the Synthetic Expression Cassettes of the Present Invention
Polynucleotide sequences coding for the above-described molecules can be
obtained using recombinant methods, such as by screening cDNA and genomic
libraries from cells expressing the gene, or by deriving the gene from a vector known
to include the same. Furthermore, the desired gene can be isolated directly from cells
and tissues containing the same, using standard techniques, such as phenol extraction
and PCR of cDNA or genomic DNA. See, e.g., Sambrook et al., supra, for a
description of techniques used to obtain and isolate DNA. The gene of interest can
also be produced synthetically, rather than cloned. The nucleotide sequence can be
designed with the appropriate codons for the particular amino acid sequence desired.
In general, one will select preferred codons for the intended host in which the
sequence will be expressed. The complete sequence is assembled from overlapping
oligonucleotides prepared by standard methods and assembled into a complete coding
sequence. See, e.g., Edge, Nature (1981) 292:756; Nambiar et al., Science (1984)
223:1299; Jay et al., J. Biol. Chem. (1984) 259:6311; Stemmer, W.P.C., (1995) Gene
164:49-53.
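
Selecting preferred codons for a given amino acid sequence amounts to a table
lookup. The sketch below shows the idea under stated assumptions: the codon table
is an illustrative one-codon-per-residue choice, not a table taken from the
specification, and a table appropriate to the intended host should be substituted.
Each codon listed does encode the indicated residue under the standard genetic code.

```python
# Illustrative sketch only: back-translate a protein with one preferred codon
# per amino acid. The table below is an assumed example, not authoritative.

PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",  # stop
}

# Inverse lookup, valid only for sequences produced from the table above.
RESIDUE_FOR_CODON = {codon: aa for aa, codon in PREFERRED_CODON.items()}


def back_translate(protein: str) -> str:
    """Return a DNA coding sequence using the preferred codon for each residue."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in protein.upper())
    except KeyError as exc:
        raise ValueError(f"unknown residue: {exc.args[0]}") from exc


def translate_designed(dna: str) -> str:
    """Translate a sequence built by back_translate back to one-letter residues."""
    return "".join(RESIDUE_FOR_CODON[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    cds = back_translate("MGARSVL*")
    print(cds, translate_designed(cds))
```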

Next, the gene sequence encoding the desired antigen can be inserted into a
vector containing a synthetic Gag or synthetic Env expression cassette of the present
invention. The antigen is inserted into the synthetic Gag coding sequence such that
when the combined sequence is expressed it results in the production of VLPs
comprising the Gag polypeptide and the antigen of interest, e.g., Env (native or
modified) or other antigen derived from HIV. Insertions can be made within the
coding sequence or at either end of the coding sequence (5', amino terminus of the
expressed Gag polypeptide; or 3', carboxy terminus of the expressed Gag
polypeptide) (Wagner, R., et al., Arch. Virol. 127:117-137, 1992; Wagner, R., et al.,
Virology 200:162-175, 1994; Wu, X., et al., J. Virol. 69(6):3389-3398, 1995; Wang,
C-T., et al., Virology 200:524-534, 1994; Chazal, N., et al., Virology 68(1):111-122,
1994; Griffiths, J.C., et al., J. Virol. 67(6):3191-3198, 1993; Reicin, A.S., et al., J.
Virol. 69(2):642-650, 1995).

Up to 50% of the coding sequences of p55Gag can be deleted without
affecting the assembly to virus-like particles and expression efficiency (Borsetti, A.,
et al., J. Virol. 72(11):9313-9317, 1998; Garnier, L., et al., J. Virol. 72(6):4667-4677,
1998; Zhang, Y., et al., J. Virol. 72(3):1782-1789, 1998; Wang, C., et al., J. Virol.
72(10):7950-7959, 1998). In one embodiment of the present invention,
immunogenicity of the high level expressing synthetic Gag expression cassettes can
be increased by the insertion of different structural or non-structural HIV antigens,
multiepitope cassettes, or cytokine sequences into deleted regions of Gag sequence.
Such deletions may be generated following the teachings of the present invention and
information available to one of ordinary skill in the art. One possible advantage of
this approach, relative to using full-length sequences fused to heterologous
polypeptides, can be higher expression/secretion efficiency of the expression product.

When sequences are added to the amino terminal end of Gag, the
polynucleotide can contain coding sequences at the 5' end that encode a signal for
addition of a myristic moiety to the Gag-containing polypeptide (e.g., sequences that
encode Met-Gly).

The ability of Gag-containing polypeptide constructs to form VLPs can be
empirically determined following the teachings of the present specification.

Gag/antigen (e.g., Gag/Env) synthetic expression cassettes include control
elements operably linked to the coding sequence, which allow for the expression of
the gene in vivo in the subject species. For example, typical promoters for
mammalian cell expression include the SV40 early promoter, a CMV promoter such
as the CMV immediate early promoter, the mouse mammary tumor virus LTR
promoter, the adenovirus major late promoter (Ad MLP), and the herpes simplex virus
promoter, among others. Other nonviral promoters, such as a promoter derived from
the murine metallothionein gene, will also find use for mammalian expression.
Typically, transcription termination and polyadenylation sequences will also be
present, located 3' to the translation stop codon. Preferably, a sequence for
optimization of initiation of translation, located 5' to the coding sequence, is also
present. Examples of transcription terminator/polyadenylation signals include those
derived from SV40, as described in Sambrook et al., supra, as well as a bovine
growth hormone terminator sequence.

Enhancer elements may also be used herein to increase expression levels of
the mammalian constructs. Examples include the SV40 early gene enhancer, as
described in Dijkema et al., EMBO J. (1985) 4:761, the enhancer/promoter derived
from the long terminal repeat (LTR) of the Rous Sarcoma Virus, as described in
Gorman et al., Proc. Natl. Acad. Sci. USA (1982b) 79:6777 and elements derived
from human CMV, as described in Boshart et al., Cell (1985) 41:521, such as
elements included in the CMV intron A sequence.

Furthermore, plasmids can be constructed which include chimeric
antigen-coding gene sequences, encoding, e.g., multiple antigens/epitopes of interest,
for example derived from more than one viral isolate.

Typically, the antigen coding sequences precede or follow the synthetic coding
sequence and the chimeric transcription unit will have a single open reading frame
encoding both the antigen of interest and the synthetic Gag coding sequences.
Alternatively, multi-cistronic cassettes (e.g., bi-cistronic cassettes) can be constructed
allowing expression of multiple antigens from a single mRNA using the EMCV
IRES, or the like.
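
Whether two joined coding sequences still form a single open reading frame can be
checked mechanically. The sketch below is an illustration under stated assumptions:
the function name and the specific rules (drop the stop codon of the upstream
sequence, require a start codon, reject internal stop codons) are choices made for
this example, and the short sequences in the usage line are toy inputs.

```python
# Illustrative sketch only: join two coding sequences and confirm the result
# is one open reading frame. Rules and names are assumptions for this example.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_cds: str, downstream_cds: str) -> str:
    """Fuse two coding sequences into a single ORF, or raise ValueError."""
    up, down = upstream_cds.upper(), downstream_cds.upper()
    if len(up) % 3 or len(down) % 3:
        raise ValueError("each coding sequence must be a multiple of three bases")
    if up[-3:] in STOP_CODONS:
        up = up[:-3]  # remove the upstream stop so translation reads through
    fused = up + down
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    if not codons or codons[0] != "ATG":
        raise ValueError("fused sequence does not begin with a start codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("internal stop codon interrupts the reading frame")
    if codons[-1] not in STOP_CODONS:
        raise ValueError("fused sequence lacks a terminal stop codon")
    return fused


if __name__ == "__main__":
    print(fuse_in_frame("ATGGGCGCCTAA", "GAGAAGTGA"))
```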

Once complete, the constructs are used for nucleic acid immunization using
standard gene delivery protocols. Methods for gene delivery are known in the art.
See, e.g., U.S. Patent Nos. 5,399,346, 5,580,859, 5,589,466. Genes can be delivered
either directly to the vertebrate subject or, alternatively, delivered ex vivo, to cells
derived from the subject and the cells reimplanted in the subject.

A number of viral based systems have been developed for gene transfer into
mammalian cells. For example, retroviruses provide a convenient platform for gene
delivery systems. Selected sequences can be inserted into a vector and packaged in
retroviral particles using techniques known in the art. The recombinant virus can then
be isolated and delivered to cells of the subject either in vivo or ex vivo. A number of
retroviral systems have been described (U.S. Patent No. 5,219,740; Miller and
Rosman, BioTechniques (1989) 7:980-990; Miller, A.D., Human Gene Therapy
(1990) 1:5-14; Scarpa et al., Virology (1991) 180:849-852; Burns et al., Proc. Natl.
Acad. Sci. USA (1993) 90:8033-8037; and Boris-Lawrie and Temin, Cur. Opin.
Genet. Develop. (1993) 3:102-109).

A number of adenovirus vectors have also been described. Unlike retroviruses
which integrate into the host genome, adenoviruses persist extrachromosomally thus
minimizing the risks associated with insertional mutagenesis (Haj-Ahmad and
Graham, J. Virol. (1986) 57:267-274; Bett et al., J. Virol. (1993) 67:5911-5921;
Mittereder et al., Human Gene Therapy (1994) 1:717-729; Seth et al., J. Virol. (1994)
68:933-940; Barr et al., Gene Therapy (1994) 1:51-58; Berkner, K.L. BioTechniques
(1988) 6:616-629; and Rich et al., Human Gene Therapy (1993) 4:461-476).

Additionally, various adeno-associated virus (AAV) vector systems have been
developed for gene delivery. AAV vectors can be readily constructed using
techniques well known in the art. See, e.g., U.S. Patent Nos. 5,173,414 and 5,139,941;
International Publication Nos. WO 92/01070 (published 23 January 1992) and WO
93/03769 (published 4 March 1993); Lebkowski et al., Molec. Cell. Biol. (1988)
8:3988-3996; Vincent et al., Vaccines 90 (1990) (Cold Spring Harbor Laboratory
Press); Carter, B.J. Current Opinion in Biotechnology (1992) 3:533-539; Muzyczka,
N. Current Topics in Microbiol. and Immunol. (1992) 158:97-129; Kotin, R.M.
Human Gene Therapy (1994) 5:793-801; Shelling and Smith, Gene Therapy (1994)
1:165-169; and Zhou et al., J. Exp. Med. (1994) 179:1867-1875.

Another vector system useful for delivering the polynucleotides of the present
invention is the enterically administered recombinant poxvirus vaccines described by
Small, Jr., P.A., et al. (U.S. Patent No. 5,676,950, issued October 14, 1997).

Additional viral vectors which will find use for delivering the nucleic acid
molecules encoding the antigens of interest include those derived from the pox family
of viruses, including vaccinia virus and avian poxvirus. By way of example, vaccinia
virus recombinants expressing the genes can be constructed as follows. The DNA
encoding the particular synthetic Gag/ or Env/antigen coding sequence is first inserted
into an appropriate vector so that it is adjacent to a vaccinia promoter and flanking
vaccinia DNA sequences, such as the sequence encoding thymidine kinase (TK).
This vector is then used to transfect cells which are simultaneously infected with
vaccinia. Homologous recombination serves to insert the vaccinia promoter plus the
gene encoding the coding sequences of interest into the viral genome. The resulting
TK- recombinant can be selected by culturing the cells in the presence of
5-bromodeoxyuridine and picking viral plaques resistant thereto.

Alternatively, avipoxviruses, such as the fowlpox and canarypox viruses, can
also be used to deliver the genes. Recombinant avipox viruses, expressing
immunogens from mammalian pathogens, are known to confer protective immunity
when administered to non-avian species. The use of an avipox vector is particularly
desirable in human and other mammalian species since members of the avipox genus
can only productively replicate in susceptible avian species and therefore are not
infective in mammalian cells. Methods for producing recombinant avipoxviruses are
known in the art and employ genetic recombination, as described above with respect
to the production of vaccinia viruses. See, e.g., WO 91/12882; WO 89/03429; and
WO 92/03545.

Molecular conjugate vectors, such as the adenovirus chimeric vectors
described in Michael et al., J. Biol. Chem. (1993) 268:6866-6869 and Wagner et al.,
Proc. Natl. Acad. Sci. USA (1992) 89:6099-6103, can also be used for gene delivery.

Members of the Alphavirus genus, such as, but not limited to, vectors derived
from the Sindbis, Semliki Forest, and Venezuelan Equine Encephalitis viruses, will
also find use as viral vectors for delivering the polynucleotides of the present
invention (for example, a synthetic Gag-polypeptide encoding expression cassette).
For a description of Sindbis-virus derived vectors useful for the practice of the instant
methods, see, Dubensky et al., J. Virol. (1996) 70:508-519; and International
Publication Nos. WO 95/07995 and WO 96/17072; as well as, Dubensky, Jr., T.W., et
al., U.S. Patent No. 5,843,723, issued December 1, 1998, and Dubensky, Jr., T.W.,
U.S. Patent No. 5,789,245, issued August 4, 1998.

A vaccinia-based infection/transfection system can be conveniently used to
provide for inducible, transient expression of the coding sequences of interest in a
host cell. In this system, cells are first infected in vitro with a vaccinia virus
recombinant that encodes the bacteriophage T7 RNA polymerase. This polymerase
displays exquisite specificity in that it only transcribes templates bearing T7
promoters. Following infection, cells are transfected with the polynucleotide of
interest, driven by a T7 promoter. The polymerase expressed in the cytoplasm from
the vaccinia virus recombinant transcribes the transfected DNA into RNA which is
then translated into protein by the host translational machinery. The method provides
for high level, transient, cytoplasmic production of large quantities of RNA and its
translation products. See, e.g., Elroy-Stein and Moss, Proc. Natl. Acad. Sci. USA
(1990) 87:6743-6747; Fuerst et al., Proc. Natl. Acad. Sci. USA (1986) 83:8122-8126.

As an alternative approach to infection with vaccinia or avipox virus
recombinants, or to the delivery of genes using other viral vectors, an amplification
system can be used that will lead to high level expression following introduction into
host cells. Specifically, a T7 RNA polymerase promoter preceding the coding region
for T7 RNA polymerase can be engineered. Translation of RNA derived from this
template will generate T7 RNA polymerase which in turn will transcribe more
template. Concomitantly, there will be a cDNA whose expression is under the control
of the T7 promoter. Thus, some of the T7 RNA polymerase generated from
translation of the amplification template RNA will lead to transcription of the desired
gene. Because some T7 RNA polymerase is required to initiate the amplification, T7
RNA polymerase can be introduced into cells along with the template(s) to prime the
transcription reaction. The polymerase can be introduced as a protein or on a plasmid
encoding the RNA polymerase. For a further discussion of T7 systems and their use
for transforming cells, see, e.g., International Publication No. WO 94/26911; Studier
and Moffatt, J. Mol. Biol. (1986) 189:113-130; Deng and Wolff, Gene (1994)
143:245-249; Gao et al., Biochem. Biophys. Res. Commun. (1994) 200:1201-1206;
Gao and Huang, Nuc. Acids Res. (1993) 21:2867-2872; Chen et al., Nuc. Acids Res.
(1994) 22:2114-2120; and U.S. Patent No. 5,135,855.
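
The self-priming behaviour of the amplification system described above can be illustrated with a toy discrete-time model. This is a sketch under stated assumptions only: the function name, the unit-free rate constants `k_tx` and `k_tl`, and the number of rounds are hypothetical and are not taken from the cited references. It shows only that nothing is transcribed without priming polymerase, and that a small priming amount is amplified.

```python
def simulate_t7_autogene(primer_polymerase, steps=3, k_tx=1.0, k_tl=1.0):
    """Toy model of a T7 promoter driving the T7 RNA polymerase gene.

    All parameters are hypothetical, unit-free illustration values.
    Returns (polymerase, gene_of_interest_rna) after `steps` rounds.
    """
    polymerase = float(primer_polymerase)  # priming enzyme supplied with the templates
    polymerase_rna = 0.0                   # transcripts of the amplification template
    gene_rna = 0.0                         # transcripts of the T7-driven cDNA of interest
    for _ in range(steps):
        transcripts = k_tx * polymerase    # transcription is proportional to enzyme present
        polymerase_rna += transcripts
        gene_rna += transcripts
        polymerase += k_tl * polymerase_rna  # translation yields additional polymerase
    return polymerase, gene_rna
```

Calling `simulate_t7_autogene(0)` leaves both quantities at zero, which mirrors the statement that some T7 RNA polymerase is required to initiate the amplification.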

A synthetic Gag- and/or Env-containing expression cassette of interest can
also be delivered without a viral vector. For example, the synthetic expression
cassette can be packaged in liposomes prior to delivery to the subject or to cells
derived therefrom. Lipid encapsulation is generally accomplished using liposomes
which are able to stably bind or entrap and retain nucleic acid. The ratio of condensed
DNA to lipid preparation can vary but will generally be around 1:1 (mg
DNA:micromoles lipid), or more of lipid. For a review of the use of liposomes as
carriers for delivery of nucleic acids, see Hug and Sleight, Biochim. Biophys. Acta
(1991) 1097:1-17; Straubinger et al., in Methods of Enzymology (1983), Vol. 101, pp.
512-527.
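
The stated ratio of about 1:1 (mg DNA:micromoles lipid) can be turned into a simple amount calculation. The helper below is a minimal sketch; the function name and the `lipid_per_mg_dna` parameter are hypothetical, and the default of 1.0 simply encodes the "around 1:1" figure given above as a lower bound on lipid.

```python
def lipid_micromoles(dna_mg, lipid_per_mg_dna=1.0):
    """Return micromoles of lipid for a given mass (mg) of condensed DNA.

    lipid_per_mg_dna is micromoles of lipid per mg DNA; 1.0 reflects the
    approximate 1:1 ratio in the text, and larger values mean more lipid.
    """
    if dna_mg < 0 or lipid_per_mg_dna <= 0:
        raise ValueError("dna_mg must be >= 0 and lipid_per_mg_dna must be > 0")
    return dna_mg * lipid_per_mg_dna
```

For example, `lipid_micromoles(2.0, 1.5)` gives the lipid needed for 2 mg of DNA when a 1:1.5 ratio is chosen.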

Liposomal preparations for use in the present invention include cationic
(positively charged), anionic (negatively charged) and neutral preparations, with
cationic liposomes particularly preferred. Cationic liposomes have been shown to
mediate intracellular delivery of plasmid DNA (Felgner et al., Proc. Natl. Acad. Sci.
USA (1987) 84:7413-7416); mRNA (Malone et al., Proc. Natl. Acad. Sci. USA (1989)
86:6077-6081); and purified transcription factors (Debs et al., J. Biol. Chem. (1990)
265:10189-10192), in functional form.

Cationic liposomes are readily available. For example,
N[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium (DOTMA) liposomes are available
under the trademark Lipofectin, from GIBCO BRL, Grand Island, NY. (See, also,
Felgner et al., Proc. Natl. Acad. Sci. USA (1987) 84:7413-7416). Other commercially
available lipids include (DDAB/DOPE) and DOTAP/DOPE (Boehringer). Other
cationic liposomes can be prepared from readily available materials using techniques
well known in the art. See, e.g., Szoka et al., Proc. Natl. Acad. Sci. USA (1978)
75:4194-4198; PCT Publication No. WO 90/11092 for a description of the synthesis
of DOTAP (1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.

Similarly, anionic and neutral liposomes are readily available, such as, from
Avanti Polar Lipids (Birmingham, AL), or can be easily prepared using readily
available materials. Such materials include phosphatidyl choline, cholesterol,
phosphatidyl ethanolamine, dioleoylphosphatidyl choline (DOPC),
dioleoylphosphatidyl glycerol (DOPG), dioleoylphosphatidyl ethanolamine (DOPE),
among others. These materials can also be mixed with the DOTMA and DOTAP
starting materials in appropriate ratios. Methods for making liposomes using these
materials are well known in the art.

The liposomes can comprise multilamellar vesicles (MLVs), small
unilamellar vesicles (SUVs), or large unilamellar vesicles (LUVs). The various
liposome-nucleic acid complexes are prepared using methods known in the art. See,
e.g., Straubinger et al., in METHODS OF IMMUNOLOGY (1983), Vol. 101, pp.
512-527; Szoka et al., Proc. Natl. Acad. Sci. USA (1978) 75:4194-4198;
Papahadjopoulos et al., Biochim. Biophys. Acta (1975) 394:483; Wilson et al., Cell
(1979) 17:77; Deamer and Bangham, Biochim. Biophys. Acta (1976) 443:629; Ostro
et al., Biochem. Biophys. Res. Commun. (1977) 76:836; Fraley et al., Proc. Natl.
Acad. Sci. USA (1979) 76:3348; Enoch and Strittmatter, Proc. Natl. Acad. Sci. USA
(1979) 76:145; Fraley et al., J. Biol. Chem. (1980) 255:10431; Szoka and
Papahadjopoulos, Proc. Natl. Acad. Sci. USA (1978) 75:145; and Schaefer-Ridder et
al., Science (1982) 215:166.

The DNA and/or protein antigen(s) can also be delivered in cochleate lipid
compositions similar to those described by Papahadjopoulos et al., Biochem. Biophys.
Acta. (1975) 394:483-491. See, also, U.S. Patent Nos. 4,663,161 and 4,871,488.

The synthetic expression cassette of interest may also be encapsulated,
adsorbed to, or associated with, particulate carriers. Such carriers present multiple
copies of a selected antigen to the immune system and promote trapping and retention
of antigens in local lymph nodes. The particles can be phagocytosed by macrophages
and can enhance antigen presentation through cytokine release. Examples of
particulate carriers include those derived from polymethyl methacrylate polymers, as
well as microparticles derived from poly(lactides) and poly(lactide-co-glycolides),
known as PLG. See, e.g., Jeffery et al., Pharm. Res. (1993) 10:362-368; McGee JP, et
al., J Microencapsul. 14(2):197-210, 1997; O'Hagan DT, et al., Vaccine 11(2):149-54,
1993. Suitable microparticles may also be manufactured in the presence of charged
detergents, such as anionic or cationic detergents, to yield microparticles with a
surface having a net negative or a net positive charge. For example, microparticles
manufactured with cationic detergents, such as hexadecyltrimethylammonium bromide
(CTAB), i.e. CTAB-PLG microparticles, adsorb negatively charged macromolecules,
such as DNA (see, e.g., Int'l Application Number PCT/US99/17308).

Furthermore, other particulate systems and polymers can be used for the in
vivo or ex vivo delivery of the gene of interest. For example, polymers such as
polylysine, polyarginine, polyornithine, spermine, spermidine, as well as conjugates
of these molecules, are useful for transferring a nucleic acid of interest. Similarly,
DEAE dextran-mediated transfection, calcium phosphate precipitation or precipitation
using other insoluble inorganic salts, such as strontium phosphate, aluminum silicates
including bentonite and kaolin, chromic oxide, magnesium silicate, talc, and the like,
will find use with the present methods. See, e.g., Felgner, P.L., Advanced Drug
Delivery Reviews (1990) 5:163-187, for a review of delivery systems useful for gene
transfer. Peptoids (Zuckerman, R.N., et al., U.S. Patent No. 5,831,005, issued
November 3, 1998) may also be used for delivery of a construct of the present
invention.

Additionally, biolistic delivery systems employing particulate carriers such as
gold and tungsten are especially useful for delivering synthetic expression cassettes
of the present invention. The particles are coated with the synthetic expression
cassette(s) to be delivered and accelerated to high velocity, generally under a reduced
atmosphere, using a gun powder discharge from a "gene gun." For a description of
such techniques, and apparatuses useful therefor, see, e.g., U.S. Patent Nos.
4,945,050; 5,036,006; 5,100,792; 5,179,022; 5,371,015; and 5,478,744. Also,
needle-less injection systems can be used (Davis, H.L., et al. Vaccine 12:1503-1509, 1994;
Bioject, Inc., Portland, OR).

Recombinant vectors carrying a synthetic expression cassette of the present
invention are formulated into compositions for delivery to the vertebrate subject.
These compositions may either be prophylactic (to prevent infection) or therapeutic
(to treat disease after infection). The compositions will comprise a "therapeutically
effective amount" of the gene of interest such that an amount of the antigen can be
produced in vivo so that an immune response is generated in the individual to which it
is administered. The exact amount necessary will vary depending on the subject
being treated; the age and general condition of the subject to be treated; the capacity
of the subject's immune system to synthesize antibodies; the degree of protection
desired; the severity of the condition being treated; the particular antigen selected and
its mode of administration, among other factors. An appropriate effective amount can
be readily determined by one of skill in the art. Thus, a "therapeutically effective
amount" will fall in a relatively broad range that can be determined through routine
trials.

The compositions will generally include one or more "pharmaceutically
acceptable excipients or vehicles" such as water, saline, glycerol, polyethyleneglycol,
hyaluronic acid, ethanol, etc. Additionally, auxiliary substances, such as wetting or
emulsifying agents, pH buffering substances, and the like, may be present in such
vehicles. Certain facilitators of nucleic acid uptake and/or expression can also be
included in the compositions or coadministered, such as, but not limited to,
bupivacaine, cardiotoxin and sucrose.

Once formulated, the compositions of the invention can be administered
directly to the subject (e.g., as described above) or, alternatively, delivered ex vivo, to
cells derived from the subject, using methods such as those described above. For
example, methods for the ex vivo delivery and reimplantation of transformed cells into
a subject are known in the art and can include, e.g., dextran-mediated transfection,
calcium phosphate precipitation, polybrene mediated transfection, lipofectamine and
LT-1 mediated transfection, protoplast fusion, electroporation, encapsulation of the
polynucleotide(s) (with or without the corresponding antigen) in liposomes, and direct
microinjection of the DNA into nuclei.

Direct delivery of synthetic expression cassette compositions in vivo will
generally be accomplished with or without viral vectors, as described above, by
injection using either a conventional syringe or a gene gun, such as the Accell® gene
delivery system (PowderJect Technologies, Inc., Oxford, England). The constructs
can be injected either subcutaneously, epidermally, intradermally, intramucosally such
as nasally, rectally and vaginally, intraperitoneally, intravenously, orally or
intramuscularly. Delivery of DNA into cells of the epidermis is particularly preferred
as this mode of administration provides access to skin-associated lymphoid cells and
provides for a transient presence of DNA in the recipient. Other modes of
administration include oral and pulmonary administration, suppositories, needle-less
injection, transcutaneous and transdermal applications. Dosage treatment may be a
single dose schedule or a multiple dose schedule. Administration of nucleic acids may
also be combined with administration of peptides or other substances.

2.3.2 Ex Vivo Delivery of the Synthetic Expression Cassettes of the Present Invention

In one embodiment, T cells, and related cell types (including but not limited to
antigen presenting cells, such as macrophages, monocytes, lymphoid cells, dendritic
cells, B-cells, T-cells, stem cells, and progenitor cells thereof), can be used for ex vivo
delivery of the synthetic expression cassettes of the present invention. T cells can be
isolated from peripheral blood lymphocytes (PBLs) by a variety of procedures known
to those skilled in the art. For example, T cell populations can be "enriched" from a
population of PBLs through the removal of accessory and B cells. In particular, T cell
enrichment can be accomplished by the elimination of non-T cells using anti-MHC
class II monoclonal antibodies. Similarly, other antibodies can be used to deplete
specific populations of non-T cells. For example, anti-Ig antibody molecules can be
used to deplete B cells and anti-MacI antibody molecules can be used to deplete
macrophages.

T cells can be further fractionated into a number of different subpopulations
by techniques known to those skilled in the art. Two major subpopulations can be
isolated based on their differential expression of the cell surface markers CD4 and
CD8. For example, following the enrichment of T cells as described above, CD4+
cells can be enriched using antibodies specific for CD4 (see Coligan et al., supra).
The antibodies may be coupled to a solid support such as magnetic beads.
Conversely, CD8+ cells can be enriched through the use of antibodies specific for
CD4 (to remove CD4+ cells), or can be isolated by the use of CD8 antibodies coupled
to a solid support. CD4 lymphocytes from HIV-1 infected patients can be expanded
ex vivo, before or after transduction as described by Wilson et al. (1995) J. Infect.
Dis. 172:88.

Following purification of T cells, a variety of methods of genetic modification
known to those skilled in the art can be performed using non-viral or viral-based gene
transfer vectors constructed as described herein. For example, one such approach
involves transduction of the purified T cell population with vector-containing
supernatant of cultures derived from vector producing cells. A second approach
involves co-cultivation of an irradiated monolayer of vector-producing cells with the
purified T cells. A third approach involves a similar co-cultivation approach;
however, the purified T cells are pre-stimulated with various cytokines and cultured
48 hours prior to the co-cultivation with the irradiated vector producing cells.
Pre-stimulation prior to such transduction increases effective gene transfer (Nolta et al.
(1992) Exp. Hematol. 20:1065). Stimulation of these cultures to proliferate also
provides increased cell populations for re-infusion into the patient. Subsequent to
co-cultivation, T cells are collected from the vector producing cell monolayer, expanded,
and frozen in liquid nitrogen.

Gene transfer vectors, containing one or more synthetic expression cassettes of
the present invention (associated with appropriate control elements for delivery to the
isolated T cells) can be assembled using known methods.

Selectable markers can also be used in the construction of gene transfer
vectors. For example, a marker can be used which imparts to a mammalian cell
transduced with the gene transfer vector resistance to a cytotoxic agent. The cytotoxic
agent can be, but is not limited to, neomycin, aminoglycoside, tetracycline,
chloramphenicol, sulfonamide, actinomycin, netropsin, distamycin A, anthracycline,
or pyrazinamide. For example, neomycin phosphotransferase II imparts resistance to
the neomycin analogue geneticin (G418).

The T cells can also be maintained in a medium containing at least one type of
growth factor prior to being selected. A variety of growth factors are known in the art
which sustain the growth of a particular cell type. Examples of such growth factors
are cytokine mitogens such as rIL-2, IL-10, IL-12, and IL-15, which promote growth
and activation of lymphocytes. Certain types of cells are stimulated by other growth
factors such as hormones, including human chorionic gonadotropin (hCG) and human
growth hormone. The selection of an appropriate growth factor for a particular cell
population is readily accomplished by one of skill in the art.

For example, white blood cells such as differentiated progenitor and stem cells
are stimulated by a variety of growth factors. More particularly, IL-3, IL-4, IL-5,
IL-6, IL-9, GM-CSF, M-CSF, and G-CSF, produced by activated Th and activated
macrophages, stimulate myeloid stem cells, which then differentiate into pluripotent
stem cells, granulocyte-monocyte progenitors, eosinophil progenitors, basophil
progenitors, megakaryocytes, and erythroid progenitors. Differentiation is modulated
by growth factors such as GM-CSF, IL-3, IL-6, IL-11, and EPO.

Pluripotent stem cells then differentiate into lymphoid stem cells, bone
marrow stromal cells, T cell progenitors, B cell progenitors, thymocytes, TH cells, TC
cells, and B cells. This differentiation is modulated by growth factors such as IL-3,
IL-4, IL-6, IL-7, GM-CSF, M-CSF, G-CSF, IL-2, and IL-5.

Granulocyte-monocyte progenitors differentiate to monocytes, macrophages,
and neutrophils. Such differentiation is modulated by the growth factors GM-CSF,
M-CSF, and IL-8. Eosinophil progenitors differentiate into eosinophils. This process
is modulated by GM-CSF and IL-5.

The differentiation of basophil progenitors into mast cells and basophils is
modulated by GM-CSF, IL-4, and IL-9. Megakaryocytes produce platelets in
response to GM-CSF, EPO, and IL-6. Erythroid progenitor cells differentiate into red
blood cells in response to EPO.

Thus, during activation by the CD3-binding agent, T cells can also be
contacted with a mitogen, for example a cytokine such as IL-2. In particularly
preferred embodiments, the IL-2 is added to the population of T cells at a
concentration of about 50 to 100 µg/ml. Activation with the CD3-binding agent can
be carried out for 2 to 4 days.

Once suitably activated, the T cells are genetically modified by contacting the
same with a suitable gene transfer vector under conditions that allow for transfection
of the vectors into the T cells. Genetic modification is carried out when the cell
density of the T cell population is between about 0.1 x 10* and 5 x 10^, preferably
between about 0.5 x 10* and 2 x 10*. A number of suitable viral and nonviral-based
gene transfer vectors have been described for use herein.

After transduction, transduced cells are selected away from non-transduced
cells using known techniques. For example, if the gene transfer vector used in the
transduction includes a selectable marker which confers resistance to a cytotoxic
agent, the cells can be contacted with the appropriate cytotoxic agent, whereby
non-transduced cells can be negatively selected away from the transduced cells. If the
selectable marker is a cell surface marker, the cells can be contacted with a binding
agent specific for the particular cell surface marker, whereby the transduced cells can
be positively selected away from the population. The selection step can also entail
fluorescence-activated cell sorting (FACS) techniques, such as where FACS is used to
select cells from the population containing a particular surface marker, or the selection
step can entail the use of magnetically responsive particles as retrievable supports for
target cell capture and/or background removal.

More particularly, positive selection of the transduced cells can be performed
using a FACS cell sorter (e.g. a FACSVantage™ Cell Sorter, Becton Dickinson
Immunocytometry Systems, San Jose, CA) to sort and collect transduced cells
expressing a selectable cell surface marker. Following transduction, the cells are
stained with fluorescent-labeled antibody molecules directed against the particular cell
surface marker. The amount of bound antibody on each cell can be measured by
passing droplets containing the cells through the cell sorter. By imparting an
electromagnetic charge to droplets containing the stained cells, the transduced cells
can be separated from other cells. The positively selected cells are then harvested in
sterile collection vessels. These cell sorting procedures are described in detail, for
example, in the FACSVantage™ Training Manual, with particular reference to
sections 3-11 to 3-28 and 10-1 to 10-17.

Positive selection of the transduced cells can also be performed using
magnetic separation of cells based on expression of a particular cell surface marker.
In such separation techniques, cells to be positively selected are first contacted with a
specific binding agent (e.g., an antibody or reagent that interacts specifically with the
cell surface marker). The cells are then contacted with retrievable particles (e.g.,
magnetically responsive particles) which are coupled with a reagent that binds the
specific binding agent (that has bound to the positive cells). The cell-binding
agent-particle complex can then be physically separated from non-labeled cells, for example
using a magnetic field. When using magnetically responsive particles, the labeled
cells can be retained in a container using a magnetic field while the negative cells are
removed. These and similar separation procedures are known to those of ordinary
skill in the art.

Expression of the vector in the selected transduced cells can be assessed by a
number of assays known to those skilled in the art. For example, Western blot or
Northern analysis can be employed depending on the nature of the inserted nucleotide
sequence of interest. Once expression has been established and the transformed T
cells have been tested for the presence of the selected synthetic expression cassette,
they are ready for infusion into a patient via the peripheral blood stream.

The invention includes a kit for genetic modification of an ex vivo population
of primary mammalian cells. The kit typically contains a gene transfer vector coding
for at least one selectable marker and at least one synthetic expression cassette
contained in one or more containers, ancillary reagents or hardware, and instructions
for use of the kit.

Experimental 

Below are examples of specific embodiments for carrying out the present
invention. The examples are offered for illustrative purposes only, and are not
intended to limit the scope of the present invention in any way.

Efforts have been made to ensure accuracy with respect to numbers used (e.g.,
amounts, temperatures, etc.), but some experimental error and deviation should, of
course, be allowed for.

Example 1

Generation of Synthetic Expression Cassettes
Modification of HIV-1 Env, Gag, Gag-protease and Gag-polymerase Nucleic
Acid Coding Sequences

The Gag, Gag-protease, and Gag-polymerase coding sequences were selected
from the Type C strains AF110965 and AF110967. The Env coding sequences were
selected from Type C strains AF110968 and AF110975. These sequences were
manipulated to maximize expression of their gene products.

First, the HIV-1 codon usage pattern was modified so that the resulting nucleic
acid coding sequence was comparable to codon usage found in highly expressed
human genes. The HIV codon usage reflects a high content of the nucleotides A or T
of the codon-triplet. The effect of the HIV-1 codon usage is a high AT content in the
DNA sequence that results in a decreased translation ability and instability of the
mRNA. In comparison, highly expressed human codons prefer the nucleotides G or
C. The coding sequences were modified to be comparable to codon usage found in
highly expressed human genes.
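
The codon-usage modification described above can be sketched in a few lines. This is a minimal illustration, not the procedure actually used to design the synthetic cassettes: the `PREFERRED` table (one G/C-rich codon per amino acid) and all function names are assumptions introduced here, and the sketch ignores the INS inactivation and frameshift-region constraints discussed in this Example. It expects an upper-case, in-frame DNA string.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Illustrative assumption: a single G/C-rich synonymous codon per residue.
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}

def translate(dna):
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def recode(dna):
    """Replace every codon with the assumed preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(dna))

def gc_fraction(dna):
    """Fraction of G and C nucleotides in the sequence."""
    return sum(base in "GC" for base in dna) / len(dna)
```

Recoding an A/T-rich fragment such as `"ATGGGTGCAAGAGCATCAATATTAAGA"` leaves the encoded peptide unchanged while raising the G/C fraction, which is the effect the paragraph above describes.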

Second, there are inhibitory (or instability) elements (INS) located within the
coding sequences of the Gag and Gag-protease coding sequences (Schneider R, et al.,
J Virol. 71(7):4892-4903, 1997). RRE is a secondary RNA structure that interacts
with the HIV encoded Rev-protein to overcome the expression down-regulating
effects of the INS. To overcome the post-transcriptional activating mechanisms of
RRE and Rev, the instability elements are inactivated by introducing multiple point
mutations that do not alter the reading frame of the encoded proteins. Figures 5 and 6
(SEQ ID Nos: 3, 4, 20 and 21) show the location of some remaining INS in synthetic
sequences derived from strains AF110965 and AF110967. The changes made to these
sequences are boxed in the Figures. In Figures 5 and 6, the top line depicts a codon
optimized sequence of Gag polypeptides from the indicated strains. The nucleotide(s)
appearing below the line in the boxed region(s) depict changes made to further
remove INS. Thus, when the changes indicated in the boxed regions are made, the
resulting sequences correspond to the sequences depicted in Figures 1 and 2,
respectively.

For the Gag-protease sequence, the changes in codon usage are restricted to
the regions up to the -1 frameshift and starting again at the end of the Gag reading
frame. Further, inhibitory (or instability) elements (INS) located within the coding
sequences of the Gag-protease polypeptide coding sequence are altered as well. The
synthetic coding sequences are assembled by methods known in the art, for example
by companies such as the Midland Certified Reagent Company (Midland, Texas).

Modification of the Gag-polymerase sequences includes modifications similar to
those described for Gag-protease in order to preserve the frameshift region.

In one embodiment of the invention, the full-length polymerase coding region
of the Gag-polymerase sequence is included with the synthetic Gag or Env sequences
in order to increase the number of epitopes for virus-like particles expressed by the
synthetic, optimized Gag/Env expression cassette. Because synthetic HIV-1
Gag-polymerase expresses the functional enzymes reverse transcriptase (RT) and integrase
(INT) (in addition to the structural proteins and protease), it is important to inactivate
RT and INT functions. Several deletions or mutations in the RT and INT coding
regions can be made to achieve catalytically nonfunctional enzymes with respect to their
RT and INT activity. (Jay A. Levy (Editor) (1995) The Retroviridae, Plenum Press,
New York. ISBN 0-306-45033X. Pages 215-20; Grimison, B. and Laurence, J.
(1995), Journal Of Acquired Immune Deficiency Syndromes and Human
Retrovirology 9(1):58-68; Wakefield, J. K., et al., (1992) Journal Of Virology
66(11):6806-6812; Esnouf, R., et al., (1995) Nature Structural Biology 2(4):303-308;
Maignan, S., et al., (1998) Journal Of Molecular Biology 282(2):359-368; Katz, R. A.
and Skalka, A. M. (1994) Annual Review Of Biochemistry 73 (1994); Jacobo-Molina,
A., et al., (1993) Proceedings Of the National Academy Of Sciences Of the United
States Of America 90(13):6320-6324; Hickman, A. B., et al., (1994) Journal Of
Biological Chemistry 269(46):29279-29287; Goldgur, Y., et al., (1998) Proceedings
Of the National Academy Of Sciences Of the United States Of America
95(16):9150-9154; Goette, M., et al., (1998) Journal Of Biological Chemistry
273(17):10139-10146; Gorton, J. L., et al., (1998) Journal of Virology 72(6):5046-5055; Engelman,
A., et al., (1997) Journal Of Virology 71(5):3507-3514; Dyda, F., et al., Science
266(5193):1981-1986; Davies, J. F., et al., (1991) Science 252(5002):88-95; Bujacz,
G., et al., (1996) Febs Letters 398(2-3):175-178; Beard, W. A., et al., (1996) Journal
Of Biological Chemistry 271(21):12213-12220; Kohlstaedt, L. A., et al., (1992)
Science 256(5065):1783-1790; Krug, M. S. and Berger, S. L. (1991) Biochemistry
30(44):10614-10623; Mazumder, A., et al., (1996) Molecular Pharmacology
49(4):621-628; Palaniappan, C., et al., (1997) Journal Of Biological Chemistry
272(17):11157-11164; Rodgers, D. W., et al., (1995) Proceedings Of the National
Academy Of Sciences Of the United States Of America 92(4):1222-1226; Sheng, N.
and Dennis, D. (1993) Biochemistry 32(18):4938-4942; Spence, R. A., et al., (1995)
Science 267(5200):988-993.)

Furthermore, selected B- and/or T-cell epitopes can be added to the
Gag-polymerase constructs within the deletions of the RT- and INT-coding sequence to
replace and augment any epitopes deleted by the functional modifications of RT and
INT. Alternately, selected B- and T-cell epitopes (including CTL epitopes) from RT
and INT can be included in a minimal VLP formed by expression of the synthetic Gag
or synthetic GagProt cassette, described above. (For descriptions of known HIV B-
and T-cell epitopes see, HIV Molecular Immunology Database CTL Search Interface;
Los Alamos Sequence Compendia, 1987-1997; Internet address:
http://hiv-web.lanl.gov/immunology/index.html.)

The resulting modified coding sequences are presented as a synthetic Env
expression cassette; a synthetic Gag expression cassette; a synthetic Gag-protease
expression cassette; and a synthetic Gag-polymerase expression cassette. A common
Gag region (Gag-common) extends from nucleotide position 844 to position 903
(SEQ ID NO:1), relative to AF110965 (or from approximately amino acid residues
282 to 301 of SEQ ID NO:17) and from nucleotide position 841 to position 900 (SEQ
ID NO:2), relative to AF110967 (or from approximately amino acid residues 281 to
300 of SEQ ID NO:22). A common Env region (Env-common) extends from
nucleotide position 1213 to position 1353 (SEQ ID NO:5) and amino acid positions
405 to 451 of SEQ ID NO:23, relative to AF110968 and from nucleotide position
1210 to position 1353 (SEQ ID NO:11) and amino acid positions 404-451 (SEQ ID
NO:24), relative to AF110975.
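
The correspondence between the nucleotide and amino acid positions quoted above follows from simple codon arithmetic. The helper below is a sketch that assumes 1-based, inclusive coordinates and a reading frame beginning at nucleotide 1 of the relevant sequence; the function name is hypothetical.

```python
def nt_range_to_aa_range(nt_start, nt_end):
    """Map a 1-based inclusive nucleotide range to 1-based residue numbers.

    Assumes the open reading frame starts at nucleotide position 1.
    """
    if nt_start < 1 or nt_end < nt_start:
        raise ValueError("expected 1 <= nt_start <= nt_end")
    return (nt_start - 1) // 3 + 1, (nt_end - 1) // 3 + 1
```

Under these assumptions, nucleotides 844 to 903 map to residues 282 to 301 and nucleotides 1213 to 1353 map to residues 405 to 451, in agreement with the Gag-common and Env-common regions stated in the text.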

The synthetic DNA fragments for Gag and Env are cloned into the following
eucaryotic expression vectors: pCMVKm2, for transient expression assays and DNA
immunization studies (the pCMVKm2 vector is derived from pCMV6a (Chapman et
al., Nuc. Acids Res. (1991) 19:3979-3986) and comprises a kanamycin selectable
marker, a ColE1 origin of replication, a CMV promoter enhancer and Intron A,
followed by an insertion site for the synthetic sequences described below followed by
a polyadenylation signal derived from bovine growth hormone; the pCMVKm2
vector differs from the pCMV-link vector only in that a polylinker site is inserted into
pCMVKm2 to generate pCMV-link); pESN2dhfr and pCMVPLEdhfr, for expression
in Chinese Hamster Ovary (CHO) cells; and pAcC13, a shuttle vector for use in the
Baculovirus expression system (pAcC13 is derived from pAcC12, which is described
by Munemitsu S., et al., Mol Cell Biol. 10(11):5977-5982, 1990).

Briefly, construction of pCMVPLEdhfr was as follows. 

To construct a DHFR cassette, the EMCV IRES (internal ribosome entry site)
leader was PCR-amplified from pCite-4a+ (Novagen, Inc., Milwaukee, WI) and
inserted into pET-23d (Novagen, Inc., Milwaukee, WI) as an Xba-Nco fragment to
give pET-EMCV. The dhfr gene was PCR-amplified from pESN2dhfr to give a
product with a Gly-Gly-Gly-Ser spacer in place of the translation stop codon and
inserted as an Nco-BamHI fragment to give pET-E-DHFR. Next, the attenuated neo
gene was PCR amplified from a pSV2Neo (Clontech, Palo Alto, CA) derivative and
inserted into the unique BamHI site of pET-E-DHFR to give pET-E-DHFR/Neo(m2).
Finally the bovine growth hormone terminator from pCDNA3 (Invitrogen, Inc.,
Carlsbad, CA) was inserted downstream of the neo gene to give
pET-E-DHFR/Neo(m2)BGHt. The EMCV-dhfr/neo selectable marker cassette fragment was
prepared by cleavage of pET-E-DHFR/Neo(m2)BGHt.

The CMV enhancer/promoter plus Intron A was transferred from pCMV6a
(Chapman et al., Nuc. Acids Res. (1991) 19:3979-3986) as a HindIII-SalI fragment
into pUC19 (New England Biolabs, Inc., Beverly, MA). The vector backbone of
pUC19 was deleted from the NdeI to the SapI sites. The above described DHFR
cassette was added to the construct such that the EMCV IRES followed the CMV
promoter. The vector also contained an ampR gene and an SV40 origin of replication.

B. Defining of the Major Homology Region (MHR) of HIV-1 p55Gag

The Major Homology Region (MHR) of HIV-1 p55 (Gag) is located in the
p24-CA sequence of Gag. It is a conserved stretch of approximately 20 amino acids.
The position in the wild type AF110965 Gag protein is from 282-301 (SEQ ID
NO:25) and spans a region from 844-903 (SEQ ID NO:26) for the Gag DNA-sequence.
The position in the synthetic Gag protein is also from 282-301 (SEQ ID
NO:25) and spans a region from 844-903 (SEQ ID NO:1) for the synthetic Gag DNA-
sequence. The position in the wild type and synthetic AF110967 Gag protein is from
281-300 (SEQ ID NO:27) and spans a region from 841-900 (SEQ ID NO:2) for the
modified Gag DNA-sequence. Mutations or deletions in the MHR can severely
impair particle production (Borsetti, A., et al., J. Virol. 72(11):9313-9317, 1998;
Mammano, F., et al., J. Virol. 68(8):4927-4936, 1994).

Percent identity to this sequence can be determined, for example, using the
Smith-Waterman search algorithm (Time Logic, Incline Village, NV), with the
following exemplary parameters: weight matrix = nuc4x4hb; gap opening penalty =
20, gap extension penalty = 5.
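
As an illustration of how a percent identity of this kind could be computed, the following Python sketch implements a Smith-Waterman local alignment with affine gap penalties (gap opening = 20, gap extension = 5). It is a minimal sketch and not the Time Logic implementation. The values of the nuc4x4hb weight matrix are not given here, so a simple match score of +5 and mismatch score of -4 are assumed in its place. It is also assumed that a gap of length k costs gap_open + (k - 1) * gap_extend, and that percent identity is the number of identical columns divided by the length of the local alignment.

```python
def smith_waterman(a, b, match=5, mismatch=-4, gap_open=20, gap_extend=5):
    """Local alignment with affine gaps (Gotoh form of Smith-Waterman).

    Returns (score, aligned_a, aligned_b, percent_identity).
    Scoring values are assumptions, not the nuc4x4hb matrix.
    """
    n, m = len(a), len(b)
    neg = float("-inf")
    # M: best score ending with a[i-1] paired with b[j-1]
    # X: best score ending with a[i-1] opposite a gap
    # Y: best score ending with b[j-1] opposite a gap
    M = [[0] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            X[i][j] = max(M[i - 1][j] - gap_open, X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open, Y[i][j - 1] - gap_extend)
            diag = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            M[i][j] = max(0, diag)  # local alignment: scores never drop below 0
            if M[i][j] > best:
                best, best_pos = M[i][j], (i, j)

    # Trace back from the best-scoring cell, tracking the current matrix.
    out_a, out_b = [], []
    i, j = best_pos
    state = "M"
    while i > 0 and j > 0:
        if state == "M":
            if M[i][j] == 0:
                break  # start of the local alignment
            s = match if a[i - 1] == b[j - 1] else mismatch
            prev = M[i][j] - s
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            if prev == M[i - 1][j - 1]:
                state = "M"
            elif prev == X[i - 1][j - 1]:
                state = "X"
            else:
                state = "Y"
            i -= 1
            j -= 1
        elif state == "X":
            out_a.append(a[i - 1])
            out_b.append("-")
            if X[i][j] == M[i - 1][j] - gap_open:
                state = "M"  # this column opened the gap
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            if Y[i][j] == M[i][j - 1] - gap_open:
                state = "M"  # this column opened the gap
            j -= 1

    aligned_a = "".join(reversed(out_a))
    aligned_b = "".join(reversed(out_b))
    columns = len(aligned_a)
    identical = sum(x == y for x, y in zip(aligned_a, aligned_b))
    identity = 100.0 * identical / columns if columns else 0.0
    return best, aligned_a, aligned_b, identity
```

For a 60-nucleotide region such as the MHR-coding sequence, an ungapped alignment with at most six mismatches would correspond to at least 90% identity under this definition.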

C. Defining of the Common Sequence Region of HIV-1 Env

The common sequence region (CSR) of HIV-1 Env is located in the C4
sequence of Env. It is a conserved stretch of approximately 47 amino acids. The
position in the wild type and synthetic AF110968 Env protein is from approximately
amino acid residue 405 to 451 (SEQ ID NO:28) and spans a region from 1213 to 1353
(SEQ ID NO:5) for the Env DNA-sequence. The position in the wild type and
synthetic AF110975 Env protein is from approximately amino acid residue 404 to 451
(SEQ ID NO:29) and spans a region from 1210 to 1353 (SEQ ID NO:11) for the Env
DNA-sequence.

Percent identity to this sequence can be determined, for example, using the
Smith-Waterman search algorithm (Time Logic, Incline Village, NV), with the
following exemplary parameters: weight matrix = nuc4x4hb; gap opening penalty =
20, gap extension penalty = 5.

Various forms of the different embodiments of the invention, described herein, 
may be combined. 

Example 2

Expression Assays for the Synthetic Coding Sequences
A. Env, Gag and Gag-Protease Coding Sequences
The wild-type Env (from AF110968 or AF110975), Gag (from AF110965 and
AF110967) and Gag-protease (from AF110965 and AF110967) sequences are cloned
into expression vectors having the same features as the vectors into which the
synthetic Env, Gag and Gag-protease sequences are cloned.

Expression efficiencies for various vectors carrying the wild-type and
synthetic Env and Gag sequences are evaluated as follows. Cells from several
mammalian cell lines (293, RD, COS-7, and CHO; all obtained from the American
Type Culture Collection, 10801 University Boulevard, Manassas, VA 20110-2209)
are transfected with 2 µg of DNA in transfection reagent LT1 (PanVera Corporation,
545 Science Dr., Madison, WI). The cells are incubated for 5 hours in reduced serum
medium (Opti-MEM, Gibco-BRL, Gaithersburg, MD). The medium is then replaced
with normal medium as follows: 293 cells, IMDM, 10% fetal calf serum, 2%
glutamine (BioWhittaker, Walkersville, MD); RD and COS-7 cells, D-MEM, 10%
fetal calf serum, 2% glutamine (Opti-MEM, Gibco-BRL, Gaithersburg, MD); and
CHO cells, Ham's F-12, 10% fetal calf serum, 2% glutamine (Opti-MEM, Gibco-BRL,
Gaithersburg, MD). The cells are incubated for either 48 or 60 hours. Cell
lysates are collected as described below in Example 3. Supernatants are harvested and
filtered through 0.45 µm syringe filters. Supernatants are evaluated using the Coulter
p24-assay (Coulter Corporation, Hialeah, FL, US), using 96-well plates coated with a
murine monoclonal antibody directed against HIV core antigen. The HIV-1 p24
antigen binds to the coated wells. Biotinylated antibodies against HIV recognize the
bound p24 antigen. Conjugated streptavidin-horseradish peroxidase reacts with the
biotin. Color develops from the reaction of peroxidase with TMB substrate. The
reaction is terminated by addition of 4N H2SO4. The intensity of the color is directly
proportional to the amount of HIV p24 antigen in a sample.
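
Because the color intensity is directly proportional to the amount of p24 antigen, sample concentrations can be read off a linear standard curve. The following Python sketch shows one way this could be done: a least-squares line is fitted to the optical densities of p24 standards and then inverted for a sample reading. The standard concentrations and optical densities shown are hypothetical placeholders and are not values from the Coulter assay; the function names are likewise illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired standards")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def p24_from_od(od, slope, intercept):
    """Invert the standard curve to estimate p24 from an optical density."""
    return (od - intercept) / slope


# Hypothetical standards: p24 concentration (pg/ml) and measured OD.
standards_pg_per_ml = [0.0, 25.0, 50.0, 100.0]
standards_od = [0.05, 0.30, 0.55, 1.05]

slope, intercept = fit_line(standards_pg_per_ml, standards_od)
sample_estimate = p24_from_od(0.80, slope, intercept)
print(f"estimated p24: {sample_estimate:.1f} pg/ml")
```

Readings above the highest standard fall outside the fitted range and would ordinarily be repeated at a higher sample dilution rather than extrapolated.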

Synthetic Env, Gag and Gag-protease expression cassettes provide dramatic
increases in production of their protein products, relative to the native (wild-type Type
C) sequences, when expressed in a variety of cell lines.

Example 3

Western Blot Analysis of Expression
A. Env, Gag and Gag-Protease Coding Sequences
Human 293 cells are transfected as described in Example 2 with pCMV6a-based
vectors containing native or synthetic Env or Gag expression cassettes. Cells
are cultivated for 60 hours post-transfection. Supernatants are prepared as described.
Cell lysates are prepared as follows. The cells are washed once with phosphate-buffered
saline, lysed with detergent [1% NP40 (Sigma Chemical Co., St. Louis, MO)
in 0.1 M Tris-HCl, pH 7.5], and the lysate transferred into fresh tubes.
SDS-polyacrylamide gels (pre-cast 8-16%; Novex, San Diego, CA) are loaded with 20 µl
of supernatant or 12.5 µl of cell lysate. A protein standard is also loaded (5 µl, broad
size range standard; BioRad Laboratories, Hercules, CA). Electrophoresis is carried
out and the proteins are transferred using a BioRad Transfer Chamber (BioRad
Laboratories, Hercules, CA) to Immobilon P membranes (Millipore Corp., Bedford,
MA) using the transfer buffer recommended by the manufacturer (Millipore), where
the transfer is performed at 100 volts for 90 minutes. The membranes are exposed to
HIV-1-positive human patient serum and immunostained using o-phenylenediamine
dihydrochloride (OPD; Sigma).

Immunoblotting analysis shows that cells containing the synthetic Env or Gag
expression cassette produce the expected protein at higher per-cell concentrations than
cells containing the native expression cassette. The proteins are seen in both cell
lysates and supernatants. The levels of production are significantly higher in cell
supernatants for cells transfected with the synthetic expression cassettes of the present
invention.

In addition, supernatants from the transfected 293 cells are fractionated on
sucrose gradients. Aliquots of the supernatant are transferred to Polyclear™
ultracentrifuge tubes (Beckman Instruments, Columbia, MD), under-laid with a solution of
20% (wt/wt) sucrose, and subjected to 2 hours centrifugation at 28,000 rpm in a
Beckman SW28 rotor. The resulting pellet is suspended in PBS and layered onto a
20-60% (wt/wt) sucrose gradient and subjected to 2 hours centrifugation at 40,000
rpm in a Beckman SW41Ti rotor.

The gradient is then fractionated into approximately 10 x 1 ml aliquots
(starting at the top, 20%-end, of the gradient). Samples are taken from fractions 1-9
and are electrophoresed on 8-16% SDS polyacrylamide gels. The supernatants from
293/synthetic Env or Gag cells give much stronger bands than supernatants from
293/native Env or Gag cells.



Example 4 

In Vivo Immunogenicity of Synthetic Gag and Env Expression Cassettes
A. Immunization

To evaluate the possibly improved immunogenicity of the synthetic Gag and
Env expression cassettes, a mouse study is performed. The plasmid DNA,
pCMVKM2 carrying the synthetic Gag expression cassette, is diluted to the following
final concentrations in a total injection volume of 100 µl: 20 µg, 2 µg, 0.2 µg, 0.02 µg
and 0.002 µg. To overcome possible negative dilution effects of the diluted DNA, the
total DNA concentration in each sample is brought up to 20 µg using the vector
(pCMVKM2) alone. As a control, plasmid DNA of the native Gag expression
cassette is handled in the same manner. Twelve groups of four to ten Balb/c mice
(Charles River, Boston, MA) are intramuscularly immunized (50 µl per leg,
intramuscular injection into the tibialis anterior) according to the schedule in Table 1.

Table 1

Group   Gag or Env            Concentration of Gag        Immunized at time
        Expression Cassette   or Env plasmid DNA (µg)     (weeks):

1       Synthetic             20                          0¹, 4
2       Synthetic             2                           0, 4
3       Synthetic             0.2                         0, 4
4       Synthetic             0.02                        0, 4
5       Synthetic             0.002                       0, 4
6       Synthetic             20                          0
7       Synthetic             2                           0
8       Synthetic             0.2                         0
9       Synthetic             0.02                        0
10      Synthetic             0.002                       0
11      Native                20                          0, 4
12      Native                2                           0, 4
13      Native                0.2                         0, 4
14      Native                0.02                        0, 4
15      Native                0.002                       0, 4
16      Native                20                          0
17      Native                2                           0
18      Native                0.2                         0
19      Native                0.02                        0
20      Native                0.002                       0

¹ = initial immunization at "week 0"



Groups 1-5 and 11-15 are bled at week 0 (before immunization), week 4, week
6, week 8, and week 12. Groups 6-10 and 16-20 are bled at week 0 (before
immunization) and at week 4.
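
The dose preparation described in section A (each dose topped up to 20 µg total DNA with empty vector, in a 100 µl injection split between two legs) involves simple arithmetic that can be sketched as follows. This Python fragment is illustrative only; the function names are hypothetical, and the doses are those listed in the text.

```python
TOTAL_DNA_UG = 20.0          # total DNA per injection, from the text
INJECTION_VOLUME_UL = 100.0  # total injection volume, from the text
DOSES_UG = [20.0, 2.0, 0.2, 0.02, 0.002]


def carrier_dna_ug(dose_ug, total_ug=TOTAL_DNA_UG):
    """Amount of empty vector DNA needed to reach the fixed total."""
    if not 0 <= dose_ug <= total_ug:
        raise ValueError("dose must lie between 0 and the total DNA amount")
    return total_ug - dose_ug


def volume_per_leg_ul(total_volume_ul=INJECTION_VOLUME_UL, legs=2):
    """Volume injected into each tibialis anterior."""
    return total_volume_ul / legs


if __name__ == "__main__":
    for dose in DOSES_UG:
        print(f"{dose:>6} ug cassette plasmid + "
              f"{carrier_dna_ug(dose):.3f} ug vector alone, "
              f"{volume_per_leg_ul():.0f} ul per leg")
```

Keeping the total DNA constant means that any difference between dose groups reflects the amount of expression cassette rather than the total DNA load.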

B. Humoral Immune Response

The humoral immune response is checked with anti-HIV Gag or Env
antibody ELISAs (enzyme-linked immunosorbent assays) of the mice sera 0 and 4
weeks post immunization (groups 5-12) and, in addition, 6 and 8 weeks post
immunization, respectively, 2 and 4 weeks post second immunization (groups 1-4).
The antibody titers of the sera are determined by anti-Gag or anti-Env
antibody ELISA. Briefly, sera from immunized mice are screened for antibodies
directed against the HIV p55 Gag protein or an Env protein, e.g., gp160 or gp120.
ELISA microtiter plates are coated with 0.2 µg of Gag or Env protein per well
overnight and washed four times; subsequently, blocking is done with PBS-0.2%
Tween (Sigma) for 2 hours. After removal of the blocking solution, 100 µl of diluted
mouse serum is added. Sera are tested at 1/25 dilutions and by serial 3-fold dilutions,
thereafter. Microtiter plates are washed four times and incubated with a secondary,
peroxidase-coupled anti-mouse IgG antibody (Pierce, Rockford, IL). ELISA plates
are washed and 100 µl of 3,3',5,5'-tetramethyl benzidine (TMB; Pierce) is added per
well. The optical density of each well is measured after 15 minutes. The titers
reported are the reciprocal of the dilution of serum that gave a half-maximum optical
density (O.D.).
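
The half-maximum titer described above can be estimated from a dilution series as sketched below. This Python example is an illustration under stated assumptions: the maximum is taken as the highest optical density in the series, and the crossing point is found by linear interpolation on the logarithm of the reciprocal dilution. Neither choice is specified in the text, and the optical densities in the usage lines are hypothetical.

```python
import math


def dilution_series(start=25, fold=3, steps=8):
    """Reciprocal dilutions: 1/25, then serial 3-fold (25, 75, 225, ...)."""
    return [start * fold ** k for k in range(steps)]


def half_max_titer(reciprocal_dilutions, ods):
    """Reciprocal dilution at which the OD falls to half of its maximum.

    Returns None if the series never crosses the half-maximum value.
    """
    half = max(ods) / 2.0
    points = list(zip(reciprocal_dilutions, ods))
    for (d_low, od_high), (d_high, od_low) in zip(points, points[1:]):
        if od_high >= half >= od_low and od_high != od_low:
            # Fraction of the way from d_low to d_high, on a log scale.
            frac = (od_high - half) / (od_high - od_low)
            log_titer = math.log10(d_low) + frac * (
                math.log10(d_high) - math.log10(d_low)
            )
            return 10 ** log_titer
    return None


# Hypothetical readings for one serum.
dilutions = dilution_series(steps=4)   # 25, 75, 225, 675
readings = [2.0, 1.5, 0.5, 0.1]
titer = half_max_titer(dilutions, readings)
print(f"half-maximum titer: {titer:.0f}")
```

A four-parameter logistic fit is a common alternative to interpolation when the full sigmoidal curve is available.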

Synthetic expression cassettes will provide a clear improvement of
immunogenicity relative to the native expression cassettes.

C. Cellular Immune Response 

The frequency of specific cytotoxic T-lymphocytes (CTL) is evaluated by a
standard chromium release assay of peptide pulsed Balb/c mouse CD4 cells. Gag or
Env expressing vaccinia virus infected CD-8 cells are used as a positive control.
Briefly, spleen cells (Effector cells, E) obtained from the BALB/c mice
immunized as described above are cultured, restimulated, and assayed for CTL
activity against Gag peptide-pulsed target cells as described (Doe, B., and Walker,
C.M., AIDS 10(7):793-794, 1996). Cytotoxic activity is measured in a standard 51Cr
release assay. Target (T) cells are cultured with effector (E) cells at various E:T ratios
for 4 hours and the average cpm from duplicate wells are used to calculate percent
specific 51Cr release.
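
The text does not state the formula used for percent specific 51Cr release. The following Python sketch uses the conventional definition, 100 x (experimental - spontaneous) / (maximum - spontaneous), applied to the mean cpm of replicate wells. The use of this formula is an assumption, and the cpm values in the usage lines are hypothetical.

```python
def mean(values):
    """Arithmetic mean of replicate wells."""
    return sum(values) / len(values)


def percent_specific_release(experimental_cpm, spontaneous_cpm, maximum_cpm):
    """Conventional percent specific 51Cr release from replicate cpm values.

    experimental_cpm: targets incubated with effector cells
    spontaneous_cpm:  targets incubated in medium alone
    maximum_cpm:      targets lysed completely (e.g. with detergent)
    """
    experimental = mean(experimental_cpm)
    spontaneous = mean(spontaneous_cpm)
    maximum = mean(maximum_cpm)
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)


# Hypothetical duplicate wells at a single E:T ratio.
release = percent_specific_release([1200, 1300], [500, 500], [3000, 3000])
print(f"{release:.1f}% specific release")
```

Repeating the calculation at each E:T ratio gives the titration curve from which CTL activity is usually compared between groups.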

Cytotoxic T-cell (CTL) activity is measured in splenocytes recovered from the
mice immunized with HIV Gag or Env DNA. Effector cells from the Gag or Env
DNA-immunized animals exhibit specific lysis of Gag or Env peptide-pulsed
SV-BALB (MHC matched) target cells, indicative of a CTL response. Target cells that
are peptide-pulsed and derived from an MHC-unmatched mouse strain (MC57) are
not lysed.

Thus, synthetic Env and Gag expression cassettes exhibit increased potency 
for induction of cytotoxic T-lymphocyte (CTL) responses by DNA immunization. 

Example 5

DNA-immunization of Non-Human Primates Using a Synthetic Env or Gag
Expression Cassette

Non-human primates are immunized multiple times (e.g., weeks 0, 4, 8 and
24) intradermally, mucosally or bilaterally, intramuscular, into the quadriceps using
various doses (e.g., 1-5 mg) of synthetic Gag- and/or Env-containing plasmids. The
animals are bled two weeks after each immunization and ELISA is performed with
isolated plasma. The ELISA is performed essentially as described in Example 4
except the second antibody-conjugate is an anti-human IgG, γ-chain specific,
peroxidase conjugate (Sigma Chemical Co., St. Louis, MO 63178) used at a dilution
of 1:500. Fifty µg/ml yeast extract is added to the dilutions of plasma samples and
antibody conjugate to reduce non-specific background due to preexisting yeast
antibodies in the non-human primates.

Further, lymphoproliferative responses to antigen can also be evaluated
post-immunization, indicative of induction of T-helper cell functions.
Both synthetic Env and Gag plasmid DNA are expected to be immunogenic in
non-human primates.

Example 6

In vitro expression of recombinant Sindbis RNA and DNA containing the synthetic
Env and Gag expression cassette

To evaluate the expression efficiency of the synthetic Env and Gag
expression cassette in Alphavirus vectors, the selected synthetic expression cassette is
subcloned into both plasmid DNA-based and recombinant vector particle-based
Sindbis virus vectors. Specifically, a cDNA vector construct for in vitro transcription
of Sindbis virus RNA vector replicons (pRSIN-luc; Dubensky, et al., J Virol 70:508-519,
1996) is modified to contain a PmeI site for plasmid linearization and a
polylinker for insertion of heterologous genes. A polylinker is generated using two
oligonucleotides that contain the sites XhoI, PmlI, ApaI, NarI, XbaI, and NotI
(XPANXNF, and XPANXNR).

The plasmid pRSIN-luc (Dubensky et al., supra) is digested with XhoI and
NotI to remove the luciferase gene insert, blunt-ended using Klenow and dNTPs, and
purified from an agarose gel using GeneCleanII (Bio101, Vista, CA). The
oligonucleotides are annealed to each other and ligated into the plasmid. The
resulting construct is digested with NotI and SacI to remove the minimal Sindbis
3'-end sequence and A40 tract, and ligated with an approximately 0.4 kbp fragment from
pKSSIN1-BV (WO 97/38087). This 0.4 kbp fragment is obtained by digestion of
pKSSIN1-BV with NotI and SacI, and purification after size fractionation from an
agarose gel. The fragment contains the complete Sindbis virus 3'-end, an A40 tract and
a PmeI site for linearization. This new vector construct is designated SINBVE.

The synthetic HIV Gag and Env coding sequences are obtained from the
parental plasmid by digestion with EcoRI, blunt-ending with Klenow and dNTPs,
purification with GeneCleanII, digestion with SalI, size fractionation on an agarose
gel, and purification from the agarose gel using GeneCleanII. The synthetic Gag or
Env coding fragment is ligated into the SINBVE vector that is digested with XhoI and
PmlI. The resulting vector is purified using GeneCleanII and is designated
SINBVGag. Vector RNA replicons may be transcribed in vitro (Dubensky et al.,
supra) from SINBVGag and used directly for transfection of cells. Alternatively, the
replicons may be packaged into recombinant vector particles by co-transfection with
defective helper RNAs or using an alphavirus packaging cell line.

The DNA-based Sindbis virus vector pDCMVSIN-beta-gal (Dubensky, et al.,
J Virol 70:508-519, 1996) is digested with SalI and XbaI, to remove the
beta-galactosidase gene insert, and purified using GeneCleanII after agarose gel size
fractionation. The HIV Gag or Env gene is inserted into the pDCMVSIN-beta-gal
by digestion of SINBVGag with SalI and XhoI, purification using GeneCleanII of the
Gag-containing fragment after agarose gel size fractionation, and ligation. The
resulting construct is designated pDSIN-Gag, and may be used directly for in vivo
administration or formulated using any of the methods described herein.

BHK and 293 cells are transfected with recombinant Sindbis RNA and DNA,
respectively. The supernatants and cell lysates are tested with the Coulter capture
ELISA (Example 2).

BHK cells are transfected by electroporation with recombinant Sindbis RNA. 

293 cells are transfected using LT-1 (Example 2) with recombinant Sindbis 
DNA. Synthetic Gag- and/or Env-containing plasmids are used as positive controls. 
Supernatants and lysates are collected 48h post transfection.

Gag and Env proteins can be efficiently expressed from both DNA and RNA- 
based Sindbis vector systems using the synthetic expression cassettes. 

Example 7 

In Vivo Immunogenicity of recombinant Sindbis Replicon Vectors containing
synthetic Gag and/or Env Expression Cassettes
A. Immunization

To evaluate the immunogenicity of recombinant synthetic Gag and Env
expression cassettes in Sindbis replicons, a mouse study is performed. The Sindbis
virus DNA vector carrying the synthetic Gag and/or Env expression cassette
(Example 6), is diluted to the following final concentrations in a total injection
volume of 100 µl: 20 µg, 2 µg, 0.2 µg, 0.02 µg and 0.002 µg. To overcome possible
negative dilution effects of the diluted DNA, the total DNA concentration in each
sample is brought up to 20 µg using the Sindbis replicon vector DNA alone. Twelve
groups of four to ten Balb/c mice (Charles River, Boston, MA) are intramuscularly
immunized (50 µl per leg, intramuscular injection into the tibialis anterior) according
to the schedule in Table 2. Alternatively, Sindbis viral particles are prepared at the 
following doses: 10^3 pfu, 10^5 pfu and 10^7 pfu in 100 µl, as shown in Table 3. Sindbis
Env or Gag particle preparations are administered to mice using intramuscular and
subcutaneous routes (50 µl per site).

Table 2

Group   Gag or Env            Concentration of Gag        Immunized at time
        Expression Cassette   or Env DNA (µg)             (weeks):

1       Synthetic             20                          0¹, 4
2       Synthetic             2                           0, 4
3       Synthetic             0.2                         0, 4
4       Synthetic             0.02                        0, 4
5       Synthetic             0.002                       0, 4
6       Synthetic             20                          0
7       Synthetic             2                           0
8       Synthetic             0.2                         0
9       Synthetic             0.02                        0
10      Synthetic             0.002                       0

¹ = initial immunization at "week 0"



Table 3

Group   Gag or Env sequence   Concentration of viral      Immunized at time
                              particle (pfu)              (weeks):

1       Synthetic             10^3                        0¹, 4
2       Synthetic             10^5                        0, 4
3       Synthetic             10^7                        0, 4
8       Synthetic             10^3                        0
9       Synthetic             10^5                        0
10      Synthetic             10^7                        0

¹ = initial immunization at "week 0"

Groups are bled and assessment of both humoral and cellular (e.g., frequency
of specific CTLs) responses is performed, essentially as described in Example 4.



Although preferred embodiments of the subject invention have been described
in some detail, it is understood that obvious variations can be made without departing
from the spirit and the scope of the invention as defined by the appended claims.

CLAIMS

1. An expression cassette, comprising
a polynucleotide sequence encoding a polypeptide including an HIV Gag
polypeptide, wherein the polynucleotide sequence encoding said Gag polypeptide
comprises a sequence having at least 90% sequence identity to the sequence presented
as either nucleotides 844-903 of Figure 1 (SEQ ID NO:1) or nucleotides 841-900 of
Figure 2 (SEQ ID NO:2).

2. An expression cassette, comprising

a polynucleotide sequence encoding a polypeptide including an HIV Gag 
polypeptide, wherein the polynucleotide sequence encoding said Gag polypeptide 
comprises a sequence having at least 90% sequence identity to the sequence presented 
as Figure 1 (SEQ ID NO:3) or Figure 2 (SEQ ID NO:4). 

3. The expression cassette of claim 2, wherein said polynucleotide sequence
encoding a polypeptide including an HIV Gag polypeptide comprises a sequence
having at least 90% sequence identity to the sequence presented as Figure 1 (SEQ ID
NO:3).

4. The expression cassette of claim 2, wherein said polynucleotide sequence
encoding a polypeptide including an HIV Gag polypeptide comprises a sequence
having at least 90% sequence identity to the sequence presented as Figure 2 (SEQ ID
NO:4).

5. The expression cassette of claim 2, wherein the polynucleotide sequence 
encoding said Gag polypeptide consists of a sequence having the sequence presented 
as Figure 1 (SEQ ID NO:3). 

6. The expression cassette of claim 2, wherein the polynucleotide sequence 
encoding said Gag polypeptide consists of a sequence having the sequence presented 
as Figure 2 (SEQ ID NO:4). 

7. The expression cassette of any of claims 1 to 6, wherein said
polynucleotide sequence further includes a polynucleotide sequence encoding an HIV
protease polypeptide.

8. The expression cassette of any of claims 1 to 6, wherein said
polynucleotide sequence further includes a polynucleotide sequence encoding an HIV
polymerase polypeptide.

9. The expression cassette of any of claims 1 to 6, wherein said
polynucleotide sequence further includes a polynucleotide sequence encoding an HIV
polymerase polypeptide, wherein the sequence encoding the HIV polymerase
polypeptide is modified by deletions of coding regions corresponding to reverse
transcriptase and integrase.

10. The expression cassette of claim 9, wherein said polynucleotide 
sequence preserves T-helper cell and CTL epitopes. 

11. An expression cassette, comprising a polynucleotide sequence
encoding a polypeptide including an HIV Env polypeptide, wherein the
polynucleotide sequence encoding said Env polypeptide comprises a sequence having
at least 90% sequence identity to the sequence presented as nucleotides 1213-1353 of
Figure 3 (SEQ ID NO:5).

12. The expression cassette of claim 11, wherein the polynucleotide
sequence encoding said Env polypeptide further comprises a sequence having at least
90% sequence identity to the sequence presented as nucleotides 82-1512 of Figure 3
(SEQ ID NO:6).

13. The expression cassette of claim 11, wherein the polynucleotide
sequence encoding said Env polypeptide further comprises a sequence having at least
90% sequence identity to the sequence presented as nucleotides 82-2025 of Figure 3
(SEQ ID NO:7).



14. The expression cassette of claim 11, wherein the polynucleotide
sequence encoding said Env polypeptide further comprises a sequence having at least
90% sequence identity to the sequence presented as nucleotides 82-2547 of Figure 3
(SEQ ID NO:8).

15. The expression cassette of claim 11, wherein the polynucleotide
sequence encoding said Env polypeptide further comprises a sequence having at least
90% sequence identity to the sequence presented as nucleotides 1-2547 of Figure 3
(SEQ ID NO:9).



16. An expression cassette, comprising a polynucleotide sequence
encoding a polypeptide including an HIV Env polypeptide, wherein the
polynucleotide sequence encoding said Env polypeptide comprises a sequence having
at least 90% sequence identity to the sequence presented as nucleotides 1513-2547 of
Figure 3 (SEQ ID NO:10).

17. An expression cassette, comprising a polynucleotide sequence
encoding a polypeptide including an HIV Env polypeptide, wherein the
polynucleotide sequence encoding said Env polypeptide comprises a sequence having
at least 90% sequence identity to the sequence presented as nucleotides 1210-1353 of
Figure 4 (SEQ ID NO:11).

18. The expression cassette of claim 17, wherein the polynucleotide
sequence encoding said Env polypeptide further comprises a sequence having at least
90% sequence identity to the sequence presented as nucleotides 73-1509 of Figure 4
(SEQ ID NO:12).

19. The expression cassette of claim 17, wherein the polynucleotide
sequence encoding said Env polypeptide further comprises a sequence having at least
90% sequence identity to the sequence presented as nucleotides 73-2022 of Figure 4
(SEQ ID NO:13).

20. The expression cassette of claim 17, wherein the polynucleotide
sequence encoding said Env polypeptide further comprises a sequence having at least
90% sequence identity to the sequence presented as nucleotides 73-2565 of Figure 4
(SEQ ID NO:14).

21. The expression cassette of claim 17, wherein the polynucleotide
sequence encoding said Env polypeptide further comprises a sequence having at least
90% sequence identity to the sequence presented as nucleotides 1-2565 of Figure 4
(SEQ ID NO:15).

22. An expression cassette, comprising a polynucleotide sequence
encoding a polypeptide including an HIV Env polypeptide, wherein the
polynucleotide sequence encoding said Env polypeptide comprises a sequence having
at least 90% sequence identity to the sequence presented as nucleotides 1510-2565 of
Figure 4 (SEQ ID NO:16).

23. An expression cassette, comprising a polynucleotide sequence
encoding a polypeptide including an HIV Env polypeptide, wherein the
polynucleotide sequence encoding said Env polypeptide consists of a sequence having
the sequence presented as Figure 3 (SEQ ID NO:9) or Figure 4 (SEQ ID NO:15).

24. A recombinant expression system for use in a selected host cell,
comprising, an expression cassette of any of claims 1-23, and wherein said
polynucleotide sequence is operably linked to control elements compatible with
expression in the selected host cell.

25. The recombinant expression system of claim 24, wherein said control
elements are selected from the group consisting of a transcription promoter, a
transcription enhancer element, a transcription termination signal, polyadenylation
sequences, sequences for optimization of initiation of translation, and translation
termination sequences.

26. The recombinant expression system of claim 24, wherein said
transcription promoter is selected from the group consisting of CMV, CMV+intron A,
SV40, RSV, HIV-ltr, MMLV-ltr, and metallothionein.

27. A cell comprising an expression cassette of any of claims 1-23, and 
wherein said polynucleotide sequence is operably linked to control elements 
compatible with expression in the selected cell. 

28. The cell of claim 27, wherein the cell is a mammalian cell.

29. The cell of claim 28, wherein the cell is selected from the group
consisting of BHK, VERO, HT1080, 293, RD, COS-7, and CHO cells.

30. The cell of claim 29, wherein said cell is a CHO cell.

31. The cell of claim 27, wherein the cell is an insect cell.

32. The cell of claim 31, wherein the cell is either Trichoplusia ni (Tn5) or
Sf9 insect cells.

33. The cell of claim 27, wherein the cell is a bacterial cell.

34. The cell of claim 27, wherein the cell is a yeast cell. 



35. The cell of claim 27, wherein the cell is a plant cell. 

36. The cell of claim 27, wherein the cell is an antigen presenting cell. 

37. The cell of claim 36, wherein the lymphoid cell is selected from the group
consisting of macrophage, monocytes, dendritic cells, B-cells, T-cells, stem cells, and
progenitor cells thereof.

38. The cell of claim 27, wherein the cell is a primary cell.

39. The cell of claim 27, wherein the cell is an immortalized cell.

40. The cell of claim 27, wherein the cell is a tumor-derived cell.

41. A composition for generating an immunological response, comprising:
an expression cassette of any of claims 1-10.

42. The composition of claim 41, further comprising a Gag polypeptide
encoded by at least one of the expression cassettes of any of claims 1-10.

43. The composition of claim 41, further comprising an adjuvant.

44. A composition for generating an immunological response, comprising:
an expression cassette of any of claims 11-23.

45. The composition of claim 44, further comprising an Env polypeptide
encoded by at least one of the expression cassettes of any of claims 11-23.

46. The composition of either claim 44 or 45 further comprising:
an expression cassette of any of claims 1-10.

47. The composition of claim 46, further comprising the Gag polypeptide
encoded by at least one expression cassette of any of claims 1-10.

48. The composition of claim 44, further comprising an adjuvant.

49. A method of immunization of a subject, comprising,
introducing a composition of any of claims 41-48 into said subject under
conditions that are compatible with expression of said expression cassette in said
subject.

50. The method of claim 49, wherein said expression cassette is introduced
using a gene delivery vector.

51. The method of claim 50, wherein the gene delivery vector is a non-viral
vector.

52. The method of claim 50, wherein said gene delivery vector is a viral
vector.

53. The method of claim 52, wherein said gene delivery vector is a Sindbis-
virus derived vector.

54. The method of claim 52, wherein said gene delivery vector is a retroviral
vector.

55. The method of claim 52, wherein said gene delivery vector is a lentiviral
vector.

56. The method of claim 49, wherein said composition is delivered using a
particulate carrier.

57. The method of claim 49, wherein said composition is coated on a gold or 
tungsten particle and said coated particle is delivered to said subject using a gene gun. 

58. The method of claim 49, wherein said composition is encapsulated in a
liposome preparation. 

59. The method of any of claims 49-58, wherein said subject is a mammal. 

60. The method of claim 59, wherein said mammal is a human.

61. A method of generating an immune response in a subject, comprising:
providing an expression cassette of any of claims 1-23,
expressing said polypeptide in a suitable host cell,
isolating said polypeptide, and
administering said polypeptide to the subject in an amount sufficient to elicit
an immune response.

62. A method of generating an immune response in a subject, comprising
introducing into cells of said subject an expression cassette of any of claims 1-23,
under conditions that permit the expression of said polynucleotide and production
of said polypeptide, thereby eliciting an immunological response to said polypeptide.

63. The method of claim 62, where the method further comprises
administration of a polypeptide produced by an expression cassette according to any
of claims 1-23 to the subject.

64. The method of claim 63, wherein administration of the polypeptide to the 
subject is carried out before introducing said expression cassette.

65. The method of claim 63, wherein administration of the polypeptide to the 
subject is carried out concurrently with introducing said expression cassette. 

66. The method of claim 63, wherein administration of the polypeptide to the
subject is carried out after introducing said expression cassette.

Gag_AF110965_BW_mod

[Nucleotide sequence figure; the sequence text is not legible in this copy. The legible portions agree with the synthetic Gag sequence of strain AF110965 given as SEQ ID NO:3 in the Sequence Listing.]

[Nucleotide sequence figure; the sequence text is not legible in this copy. The legible portions agree with the synthetic Gag sequence given as SEQ ID NO:4 in the Sequence Listing.]

Fig. 3

Env_AF110968_C_BW_opt
--> signal peptide (1-81)

ATGCGCGTGATGGGCATCCTGAAGAACTACCAGCAGTGGTGGATGTGGGGCATCCTGGGCTTCTGGATGCTGATCA

\/ --> gp120/140/160 (82)
TCAGCAGCGTGGTGGGCAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCAAGACCACCCT

GTTCTGCACCAGCGACGCCAAGGCCTACGAGACCGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACC

GACCCCAACCCCCAGGAGATCGTGCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACGACATGGTGGACC

AGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGAC

CCTGAAGTGCCGCAACGTGAACGCCACCAACAACATCAACAGCATGATCGACAACAGCAACAAGGGCGAGATGAAG

AACTGCAGCTTCAACGTGACCACCGAGCTGCGCGACCGCAAGCAGGAGGTGCACGCCCTGTTCTACCGCCTGGACG

TGGTGCCCCTGCAGGGCAACAACAGCAACGAGTACCGCCTGATCAACTGCAACACCAGCGCCATCACCCAGGCCTG

CCCCAAGGTGAGCTTCGACCCCATCCCCATCCACTACTGCACCCCCGCCGGCTACGCCATCCTGAAGTGCAACAAC 

CAGACCTTCAACGGCACCGGCCCCTGCAACAACGTGAGCAGCGTGCAGTGCGCCCACGGCATCAAGCCCGTGGTGA 

GCACCCAGCTGCTGCTGAACGGCAGCCTGGCCAAGGGCGAGATCATCATCCGCAGCGAGAACCTGGCCAACAACGC 

CAAGATCATCATCGTGCAGCTGAACAAGCCCGTGAAGATCGTGTGCGTGCGCCCCAACAACAACACCCGCAAGAGC

GTGCGCATCGGCCCCGGCCAGACCTTCTACGCCACCGGCGAGATCATCGGCGACATCCGCCAGGCCTACTGCATCA

TCAACAAGACCGAGTGGAACAGCACCCTGCAGGGCGTGAGCAAGAAGCTGGAGGAGCACTTCAGCAAGAAGGCCAT

CAAGTTCGAGCCCAGCAGCGGCGGCGACCTGGAGATCACCACCCACAGCTTCAACTGCCGCGGCGAGTTCTTCTAC

TGCGACACCAGCCAGCTGTTCAACAGCACCTACAGCCCCAGCTTCAACGGCACCGAGAACAAGCTGAACGGCACCA 

TCACCATCACCTGCCGCATCAAGCAGATCATCAACATGTGGCAGAAGGTGGGCCGCGCCATGTACGCCCCCCCCAT 

CGCCGGCAACCTGACCTGCGAGAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGACCGGCCCCAAC

GACACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAACGAGCTGTACAAGTACAAGGTGG

TGGAGATCAAGCCCCTGGGCGTGGCCCCCACCGAGGCCAAGCGCCGCGTGGTGGAGCGCGAGAAGCGCGCCGTGGG

CATCGGCGCCGTGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCGCCAGCATCACCCTGACCGTG 

CAGGCCCGCCTGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACC 

TGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGACCCGCATCCTGGCCGTGGAGCGCTACCTGAAGGACCA 

GCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACAGCAGCTGGAGC 

AACCGCAGCCACGACGAGATCTGGGACAACATGACCTGGATGCAGTGGGACCGCGAGATCAACAACTACACCGACA

CCATCTACCGCCTGCTGGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGAAGGACCTGCTGGCCCTGGACAGCTG 

gp140 (2025) <-- \/

GCAGAACCTGTGGAACTGGTTCAGCATCACCAACTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGC

CTGATCGGCCTGCGCATCATCTTCGCCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGCCCT 

TCCAGACCCTGACCCCCAACCCCCGCGAGCCCGACCGCCTGGGCCGCATCGAGGAGGAGGGCGGCGAGCAGGACCG 

CGGCCGCAGCATCCGCCTGGTGAGCGGCTTCCTGGCCCTGGCCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGC 

TACCACCGCCTGCGCGACTTCATCCTGATCGCCGCCCGCGTGCTGGAGCTGCTGGGCCAGCGCGGCTGGGAGGCCC 

TGAAGTACCTGGGCAGCCTGGTGCAGTACTGGGGCCTGGAGCTGAAGAAGAGCGCCATCAGCCTGCTGGACACCAT 

CGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGTTCATCCAGCGCATCTGCCGCGCCATCCGCAACATC

gp160, gp41 (2547) <-- \
CCCCGCCGCATCCGCCAGGGCTTCGAGGCCGCCCTGCAGTAA 
Fig. 4

Env_AF110975_C_BW_opt 

--> signal peptide (1-72) \/ -->

ATGCGCGTOCGCGGCATCCTGCGCAGCnrGGCAGCAGTGGTGGATCTGGGGCATCCTGGGCTTCTGGATCT^^ 
gp120/140/160 (72)

GCCTGGGCAACCTGTGGGTGACCGTGTACGACGGCGTGCCCGTGTGGCGCGAGGCCAGCACCACCCTGTTCTGCGC

CAGCGACGCCAAGGCCTACGAGAAGGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAAC

CCCCAGGAGATCGAGCTGGACAACGTGACCGAGAACTTCAACATGTGGAAGAACGACATGGTGGACCAGATGCACG

AGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCCGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAAGTG 

CACCAACTACAGCACCAACTACAGCAACACCATGAACGCCACCAGCTACAACAACAACACCACCGAGGAGATCAAG

AACTGCACCTTCAACATGACCACCGAGCTGCGCGACAAGAAGCAGCAGGTGTACGCCCTGTTCTACAAGCTGGACA

TCGTGCCCCTGAACAGCAACAGCAGCGAGTACCGCCTGATCAACTGCAACACCAGCGCCATCACCCAGGCCTGCCC

CAAGGTGAGCTTCGACCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTACGCCATCCTGAAGTGCAAGAACAAC

ACCAGCAACGGCACCGGCCCCTGCCAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCAAGCCCGTGGTGAGCA 

CCCCCCTGCTGCTGAACGGCAGCCTGGCCGAGGGCGGCGAGATCATCATCCGCAGCAAGAACCTGAGCAACAACGC 

CTACACCATCATCGTGCACCTGAACGACAGCGTGGAGATCGTGTGCACCCGCCCCAACAACAACACCCGCAAGGGC 

ATCCGCATCGGCCCCGGCCAGACCTTCTACGCCACCGAGAACATCATCGGCGACATCCGCCAGGCCCACTGCAACA 

TCAGCGCCGGCGAGTGGAACAAGGCCGTGCAGCGCGTGAGCGCCAAGCTGCGCGAGCACTTCCCCAACAAGACCAT 

CGAGTTCCAGCCCAGCAGCGGCGGCGACCTGGAGATCACCACCCACAGCTTCAACTGCCGCGGCGAGTTCTTCTAC 

TGCAACACCAGCAAGCTGTTCAACAGCAGCTACAACGGCACCAGCTACCGCGGCACCGAGAGCAACAGCAGCATCA 

TCACCCTGCCCTGCCGCATCAAGCAGATCATCGACATGTGGCAGAAGGTGGGCCGCGCCATCTACGCCCCCCCCAT

CGAGGGCAACATCACCTGCAGCAGCAGCATCACCGGCCTGCTGCTGGCCCGCGACGGCGGCCTGGACAACATCACC

ACCGAGATCTTCCGCCCCCAGGGCGGCGACATGAAGGACAACTGGCGCAACGAGCTGTACAAGTACAAGGTGGTGG

gp120 (1509) <-- \/ --> (1510) gp41

AGATCAAGCCCCTGGGCGTGGCCCCCACCGAGGCCAAGCGCCGCGTGGTGGAGCGCGAGAAGCGCGCCGTGGGCAT 

CGGCGCCGTGATCTTCGGCTTCCTGGGCGCCGCCGGCAGCAACATGGGCGCCGCCAGCATCACCCTGACCGCCCAG 

GCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAGCAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACATGC 

TGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCrrGGCCATCGAGCGCTACC^^ 

GCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACAGC^ 

AAGACCCAGGGCGAGATCTGGGAGAACATGACCTGGATGCAGTGGGACAAGGAGATCAGCAACTACACCG^^ 

TCTACCGCCTGCTGGAGGAGAGCCAGAACCAGCAGGAGCAGAACGAGAAGGACCTGCTGGCCCTGGACAGCCGCAA 

gp140 (2022) <-- \/

CAACCTGTGGAGCTGGTTCAACATCAGCAACTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTG 
ATCGGCCTGCGCATCATCTTCGCCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCC 
AGACCCTGACCCCCAACCCCCGCGGCCTGGACCGCCTGGGCCGCATCGAGGAGGAGGGCGGCGAGCAGGACCGCGA 
CCGCAGCATCCGCCTGGTGCAGGGCTTCCTGGCCCTGGCCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTAC 

AGCGCGGCTGGGAGGCCCTGAAGTACCTGGGCAGCCTGGTGCAGTACTGGGGCCTGGAGCTGAAGAAGAGCGCCAC 

CAGCCTGCTGGACAGCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGATCCAGCGCATCTAC

gp160, gp41 (2565) <-- \
CGCGCCTTCTGCAACATCCCCCGCCGCGTGCGCCAGGGCTTCGAGGCCGCCCTGCAGTAA 
[Nucleotide sequence figure, beginning ATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGAC and ending CCCCTGAGCCAGTAA; the intervening sequence text is not legible in this copy.]

Figure 5
[Nucleotide sequence figure, ending GGCCCCCTGAGCCAGTAA; the remaining sequence text is not legible in this copy.]

SEQUENCE LISTING



<110> Chiron Corporation 

<120> POLYNUCLEOTIDES ENCODING ANTIGENIC HIV TYPE C 
POLYPEPTIDES, POLYPEPTIDES AND USES THEREOF 

<130> 1631.100 

<140> 
<141> 

<150> 60/152,195 
<151> 1999-09-01 

<160> 29

<170> PatentIn Ver. 2.0

<210> 1 
<211> 60 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 1 

gacatcaagc agggccccaa ggagcccttc cgcgactacg tggaccgctt cttcaagacc 60 

<210> 2 
<211> 60 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 2 

gacatccgcc agggccccaa ggagcccttc cgcgactacg tggaccgctt cttcaagacc 60

<210> 3 
<211> 1479 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic Gag 
of HIV strain AF110965

<400> 3 

atgggcgccc gcgccagcat cctgcgcggc ggcaagctgg acgcctggga gcgcatccgc 60 
ctgcgccccg gcggcaagaa gtgctacatg atgaagcacc tggtgtgggc cagccgcgag 120 
ctggagaagt tcgccctgaa ccccggcctg ctggagacca gcgagggctg caagcagatc 180 
atccgccagc tgcaccccgc cctgcagacc ggcagcgagg agctgaagag cctgttcaac 240
accgtggcca ccctgtactg cgtgcacgag aagatcgagg tccgcgacac caaggaggcc 300 
ctggacaaga tcgaggagga gcagaacaag tgccagcaga agatccagca ggccgaggcc 360 
gccgacaagg gcaaggtgag ccagaactac cccatcgtgc agaacctgca gggccagatg 420 
gtgcaccagg ccatcagccc ccgcaccctg aacgcctggg tgaaggtgat cgaggagaag 480 
gccttcagcc ccgaggtgat ccccatgttc accgccctga gcgagggcgc caccccccag 540 
gacctgaaca cgatgttgaa caccgtgggc ggccaccagg ccgccatgca gatgctgaag 600 
gacaccatca acgaggaggc cgccgagtgg gaccgcgtgc accccgtgca cgccggcccc 660 
atcgcccccg gccagatgcg cgagccccgc ggcagcgaca tcgccggcac caccagcacc 720 
ctgcaggagc agatcgcctg gatgaccagc aaccccccca tccccgtggg cgacatctac 780
aagcggtgga tcatcctggg cctgaacaag atcgtgcgga tgtacagccc cgtgagcatc 840
ctggacatca agcagggccc caaggagccc ttccgcgact acgtggaccg cttcttcaag 900 
accctgcgcg ccgagcagag cacccaggag gtgaagaact ggatgaccga caccctgctg 960 
gtgcagaacg ccaaccccga ctgcaagacc atcctgcgcg ctctcggccc cggcgccagc 1020 
ctggaggaga tgatgaccgc ctgccagggc gtgggcggcc ccagccacaa ggcccgcgtg 1080 
ctggccgagg cgatgagcca ggccaacacc agcgtgatga tgcagaagag caacttcaag 1140 
ggcccccggc gcatcgtcaa gtgcttcaac tgcggcaagg agggccacat cgcccgcaac 1200 
tgccgcgccc cccgcaagaa gggctgctgg aagtgcggca aggagggcca ccagatgaag 1260 
gactgcaccg agcgccaggc caacttcctg ggcaagatct ggcccagcca caagggccgc 1320 
cccggcaact tcctgcagag ccgccccgag cccaccgccc cccccgccga gagcttccgc 1380 
ttcgaggaga ccacccccgg ccagaagcag gagagcaagg accgcgagac cctgaccagc 1440 
ctgaagagcc tgttcggcaa cgaccccctg agccagtaa 1479 

<210> 4 
<211> 1509 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic Gag 
of HIV strain AF110967

<400> 4 

atgggcgccc gcgccagcat cctgcgcggc gagaagctgg acaagtggga gaagatccgc 60
ctgcgccccg gcggcaagaa gcactacatg ctgaagcacc tggtgtgggc cagccgcgag 120 
ctggagggct tcgccctgaa ccccggcctg ctggagaccg ccgagggctg caagcagatc 180 
atgaagcagc tgcagcccgc cctgcagacc ggcaccgagg agctgcgcag cctgtacaac 240 
accgtggcca ccctgtactg cgtgcacgcc ggcatcgagg tccgcgacac caaggaggcc 300
ctggacaaga tcgaggagga gcagaacaag tcccagcaga agacccagca ggccaaggag 360 
gccgacggca aggtgagcca gaactacccc atcgtgcaga acctgcaggg ccagatggtg 420 
caccaggcca tcagcccccg caccctgaac gcctgggtga aggtgatcga ggagaaggcc 480
ttcagccccg aggtgatccc catgttcacc gccctgagcg agggcgccac cccccaggac 540
ctgaacacga tgttgaacac cgtgggcggc caccaggccg ccatgcagat gctgaaggac 600 
accatcaacg aggaggccgc cgagtgggac cgcctgcacc ccgtgcaggc cggccccgtg 660 
gcccccggcc agatgcgcga cccccgcggc agcgacatcg ccggcgccac cagcaccctg 720 
caggagcaga tcgcctggat gaccagcaac ccccccgtgc ccgtgggcga catctacaag 780 
cggtggatca tcctgggcct gaacaagatc gtgcggatgt acagccccgt gagcatcctg 840 
gacatccgcc agggccccaa ggagcccttc cgcgactacg tggaccgctt cttcaagacc 900 
ctgcgcgccg agcaggccac ccaggacgtg aagaactgga tgaccgagac cctgctggtg 960 
cagaacgcca accccgactg caagaccatc ctgcgcgctc tcggccccgg cgccaccctg 1020 
gaggagatga tgaccgcctg ccagggcgtg ggcggccccg gccacaaggc ccgcgtgctg 1080 
gccgaggcga tgagccaggc caacagcgtg aacatcatga tgcagaagag caacttcaag 1140
ggcccccggc gcaacgtcaa gtgcttcaac tgcggcaagg agggccacat cgccaagaac 1200 
tgccgcgccc cccgcaagaa gggctgctgg aagtgcggca aggagggcca ccagatgaag 1260 
gactgcaccg agcgccaggc caacttcctg ggcaagatct ggcccagcca caagggccgc 1320 
cccggcaact tcctgcagaa ccgcagcgag cccgccgccc ccaccgtgcc caccgccccc 1380 
cccgccgaga gcttccgctt cgaggagacc acccccgccc ccaagcagga gcccaaggac 1440
cgcgagccct accgcgagcc cctgaccgcc ctgcgcagcc tgttcggcag cggccccctg 1500 
agccagtaa 1509 

<210> 5 
<211> 141 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: Env common 
region of HIV strain AF110968 

<400> 5

accatcacca tcacctgccg catcaagcag atcatcaaca tgtggcagaa ggtgggccgc 60
gccatgtacg ccccccccat cgccggcaac ctgacctgcg agagcaacat caccggcctg 120 
ctgctgaccc gcgacggcgg c 141 

<210> 6 
<211> 1431 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
gp120 coding region of HIV strain AF110968

<400> 6 

agcgtggtgg gcaacctgtg ggtgaccgtg tactacggcg tgcccgtgtg gaaggaggcc 60 
aagaccaccc tgttctgcac cagcgacgcc aaggcctacg agaccgaggt gcacaacgtg 120 
tgggccaccc acgcctgcgt gcccaccgac cccaaccccc aggagatcgt gctggagaac 180 
gtgaccgaga acttcaacat gtggaagaac gacatggtgg accagatgca cgaggacatc 240 
atcagcctgt gggaccagag cctgaagccc tgcgtgaagc tgacccccct gtgcgtgacc 300 
ctgaagtgcc gcaacgtgaa cgccaccaac aacatcaaca gcatgatcga caacagcaac 360 
aagggcgaga tgaagaactg cagcttcaac gtgaccaccg agctgcgcga ccgcaagcag 420 
gaggtgcacg ccctgttcta ccgcctggac gtggtgcccc tgcagggcaa caacagcaac 480
gagtaccgcc tgatcaactg caacaccagc gccatcaccc aggcctgccc caaggtgagc 540 
ttcgacccca tccccatcca ctactgcacc cccgccggct acgccatcct gaagtgcaac 600 
aaccagacct tcaacggcac cggcccctgc aacaacgtga gcagcgtgca gtgcgcccac 660 
ggcatcaagc ccgtggtgag cacccagctg ctgctgaacg gcagcctggc caagggcgag 720 
atcatcatcc gcagcgagaa cctggccaac aacgccaaga tcatcatcgt gcagctgaac 780
aagcccgtga agatcgtgtg cgtgcgcccc aacaacaaca cccgcaagag cgtgcgcatc 840
ggccccggcc agaccttcta cgccaccggc gagatcatcg gcgacatccg ccaggcctac 900 
tgcatcatca acaagaccga gtggaacagc accctgcagg gcgtgagcaa gaagctggag 960 
gagcacttca gcaagaaggc catcaagttc gagcccagca gcggcggcga cctggagatc 1020 
accacccaca gcttcaactg ccgcggcgag ttcttctact gcgacaccag ccagctgttc 1080 
aacagcacct acagccccag cttcaacggc accgagaaca agctgaacgg caccatcacc 1140 
atcacctgcc gcatcaagca gatcatcaac atgtggcaga aggtgggccg cgccatgtac 1200 
gcccccccca tcgccggcaa cctgacctgc gagagcaaca tcaccggcct gctgctgacc 1260 
cgcgacggcg gcaagaccgg ccccaacgac accgagatct tccgccccgg cggcggcgac 1320 
atgcgcgaca actggcgcaa cgagctgtac aagtacaagg tggtggagat caagcccctg 1380
ggcgtggccc ccaccgaggc caagcgccgc gtggtggagc gcgagaagcg c 1431 

<210> 7 
<211> 1944 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
gp140 coding region of HIV strain AF110968

<400> 7 

agcgtggtgg gcaacctgtg ggtgaccgtg tactacggcg tgcccgtgtg gaaggaggcc 60 
aagaccaccc tgttctgcac cagcgacgcc aaggcctacg agaccgaggt gcacaacgtg 120 
tgggccaccc acgcctgcgt gcccaccgac cccaaccccc aggagatcgt gctggagaac 180 
gtgaccgaga acttcaacat gtggaagaac gacatggtgg accagatgca cgaggacatc 240
atcagcctgt gggaccagag cctgaagccc tgcgtgaagc tgacccccct gtgcgtgacc 300 
ctgaagtgcc gcaacgtgaa cgccaccaac aacatcaaca gcatgatcga caacagcaac 360 
aagggcgaga tgaagaactg cagcttcaac gtgaccaccg agctgcgcga ccgcaagcag 420 
gaggtgcacg ccctgttcta ccgcctggac gtggtgcccc tgcagggcaa caacagcaac 480 
gagtaccgcc tgatcaactg caacaccagc gccatcaccc aggcctgccc caaggtgagc 540 
ttcgacccca tccccatcca ctactgcacc cccgccggct acgccatcct gaagtgcaac 600
aaccagacct tcaacggcac cggcccctgc aacaacgtga gcagcgtgca gtgcgcccac 660
ggcatcaagc ccgtggtgag cacccagctg ctgctgaacg gcagcctggc caagggcgag 720
atcatcatcc gcagcgagaa cctggccaac aacgccaaga tcatcatcgt gcagctgaac 780
aagcccgtga agatcgtgtg cgtgcgcccc aacaacaaca cccgcaagag cgtgcgcatc 840
ggccccggcc agaccttcta cgccaccggc gagatcatcg gcgacatccg ccaggcctac 900
tgcatcatca acaagaccga gtggaacagc accctgcagg gcgtgagcaa gaagctggag 960
gagcacttca gcaagaaggc catcaagttc gagcccagca gcggcggcga cctggagatc 1020
accacccaca gcttcaactg ccgcggcgag ttcttctact gcgacaccag ccagctgttc 1080
aacagcacct acagccccag cttcaacggc accgagaaca agctgaacgg caccatcacc 1140
atcacctgcc gcatcaagca gatcatcaac atgtggcaga aggtgggccg cgccatgtac 1200
gcccccccca tcgccggcaa cctgacctgc gagagcaaca tcaccggcct gctgctgacc 1260
cgcgacggcg gcaagaccgg ccccaacgac accgagatct tccgccccgg cggcggcgac 1320
atgcgcgaca actggcgcaa cgagctgtac aagtacaagg tggtggagat caagcccctg 1380
ggcgtggccc ccaccgaggc caagcgccgc gtggtggagc gcgagaagcg cgccgtgggc 1440
atcggcgccg tgttcctggg cttcctgggc gccgccggca gcaccatggg cgccgccagc 1500
atcaccctga ccgtgcaggc ccgcctgctg ctgagcggca tcgtgcagca gcagaacaac 1560
ctgctgcgcg ccatcgaggc ccagcagcac ctgctgcagc tgaccgtgtg gggcatcaag 1620
cagctgcaga cccgcatcct ggccgtggag cgctacctga aggaccagca gctgctgggc 1680
atctggggct gcagcggcaa gctgatctgc accaccgccg tgccctggaa cagcagctgg 1740
agcaaccgca gccacgacga gatctgggac aacatgacct ggatgcagtg ggaccgcgag 1800
atcaacaact acaccgacac catctaccgc ctgctggagg agagccagaa ccagcaggag 1860
aagaacgaga aggacctgct ggccctggac agctggcaga acctgtggaa ctggttcagc 1920
atcaccaact ggctgtggta catc 1944



<210> 8 
<211> 2466 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic
gp160 coding region of HIV strain AF110968

<400> 8

agcgtggtgg gcaacctgtg ggtgaccgtg tactacggcg tgcccgtgtg gaaggaggcc 60
aagaccaccc tgttctgcac cagcgacgcc aaggcctacg agaccgaggt gcacaacgtg 120
tgggccaccc acgcctgcgt gcccaccgac cccaaccccc aggagatcgt gctggagaac 180
gtgaccgaga acttcaacat gtggaagaac gacatggtgg accagatgca cgaggacatc 240
atcagcctgt gggaccagag cctgaagccc tgcgtgaagc tgacccccct gtgcgtgacc 300
ctgaagtgcc gcaacgtgaa cgccaccaac aacatcaaca gcatgatcga caacagcaac 360
aagggcgaga tgaagaactg cagcttcaac gtgaccaccg agctgcgcga ccgcaagcag 420
gaggtgcacg ccctgttcta ccgcctggac gtggtgcccc tgcagggcaa caacagcaac 480
gagtaccgcc tgatcaactg caacaccagc gccatcaccc aggcctgccc caaggtgagc 540
ttcgacccca tccccatcca ctactgcacc cccgccggct acgccatcct gaagtgcaac 600
aaccagacct tcaacggcac cggcccctgc aacaacgtga gcagcgtgca gtgcgcccac 660
ggcatcaagc ccgtggtgag cacccagctg ctgctgaacg gcagcctggc caagggcgag 720
atcatcatcc gcagcgagaa cctggccaac aacgccaaga tcatcatcgt gcagctgaac 780
aagcccgtga agatcgtgtg cgtgcgcccc aacaacaaca cccgcaagag cgtgcgcatc 840
ggccccggcc agaccttcta cgccaccggc gagatcatcg gcgacatccg ccaggcctac 900
tgcatcatca acaagaccga gtggaacagc accctgcagg gcgtgagcaa gaagctggag 960
gagcacttca gcaagaaggc catcaagttc gagcccagca gcggcggcga cctggagatc 1020
accacccaca gcttcaactg ccgcggcgag ttcttctact gcgacaccag ccagctgttc 1080
aacagcacct acagccccag cttcaacggc accgagaaca agctgaacgg caccatcacc 1140
atcacctgcc gcatcaagca gatcatcaac atgtggcaga aggtgggccg cgccatgtac 1200
gcccccccca tcgccggcaa cctgacctgc gagagcaaca tcaccggcct gctgctgacc 1260
cgcgacggcg gcaagaccgg ccccaacgac accgagatct tccgccccgg cggcggcgac 1320
atgcgcgaca actggcgcaa cgagctgtac aagtacaagg tggtggagat caagcccctg 1380
ggcgtggccc ccaccgaggc caagcgccgc gtggtggagc gcgagaagcg cgccgtgggc 1440
atcggcgccg tgttcctggg cttcctgggc gccgccggca gcaccatggg cgccgccagc 1500
atcaccctga ccgtgcaggc ccgcctgctg ctgagcggca tcgtgcagca gcagaacaac 1560
ctgctgcgcg ccatcgaggc ccagcagcac ctgctgcagc tgaccgtgtg gggcatcaag 1620
cagctgcaga cccgcatcct ggccgtggag cgctacctga aggaccagca gctgctgggc 1680
atctggggct gcagcggcaa gctgatctgc accaccgccg tgccctggaa cagcagctgg 1740
agcaaccgca gccacgacga gatctgggac aacatgacct ggatgcagtg ggaccgcgag 1800
atcaacaact acaccgacac catctaccgc ctgctggagg agagccagaa ccagcaggag 1860
aagaacgaga aggacctgct ggccctggac agctggcaga acctgtggaa ctggttcagc 1920
atcaccaact ggctgtggta catcaagatc ttcatcatga tcgtgggcgg cctgatcggc 1980
ctgcgcatca tcttcgccgt gctgagcatc gtgaaccgcg tgcgccaggg ctacagcccc 2040
ctgcccttcc agaccctgac ccccaacccc cgcgagcccg accgcctggg ccgcatcgag 2100
gaggagggcg gcgagcagga ccgcggccgc agcatccgcc tggtgagcgg cttcctggcc 2160
ctggcctggg acgacctgcg cagcctgtgc ctgttcagct accaccgcct gcgcgacttc 2220
atcctgatcg ccgcccgcgt gctggagctg ctgggccagc gcggctggga ggccctgaag 2280
tacctgggca gcctggtgca gtactggggc ctggagctga agaagagcgc catcagcctg 2340
ctggacacca tcgccatcgc cgtggccgag ggcaccgacc gcatcatcga gttcatccag 2400
cgcatctgcc gcgccatccg caacatcccc cgccgcatcc gccagggctt cgaggccgcc 2460
ctgcag 2466

<210> 9
<211> 2547
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: synthetic
signal sequence and gp160 coding region of HIV
strain AF110968



<400> 9 

atgcgcgtga tgggcatcct gaagaactac cagcagtggt ggatgtgggg catcctgggc 60
ttctggatgc tgatcatcag cagcgtggtg ggcaacctgt gggtgaccgt gtactacggc 120
gtgcccgtgt ggaaggaggc caagaccacc ctgttctgca ccagcgacgc caaggcctac 180
gagaccgagg tgcacaacgt gtgggccacc cacgcctgcg tgcccaccga ccccaacccc 240
caggagatcg tgctggagaa cgtgaccgag aacttcaaca tgtggaagaa cgacatggtg 300
gaccagatgc acgaggacat catcagcctg tgggaccaga gcctgaagcc ctgcgtgaag 360
ctgacccccc tgtgcgtgac cctgaagtgc cgcaacgtga acgccaccaa caacatcaac 420
agcatgatcg acaacagcaa caagggcgag atgaagaact gcagcttcaa cgtgaccacc 480
gagctgcgcg accgcaagca ggaggtgcac gccctgttct accgcctgga cgtggtgccc 540
ctgcagggca acaacagcaa cgagtaccgc ctgatcaact gcaacaccag cgccatcacc 600
caggcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcac ccccgccggc 660
tacgccatcc tgaagtgcaa caaccagacc ttcaacggca ccggcccctg caacaacgtg 720
agcagcgtgc agtgcgccca cggcatcaag cccgtggtga gcacccagct gctgctgaac 780
ggcagcctgg ccaagggcga gatcatcatc cgcagcgaga acctggccaa caacgccaag 840
atcatcatcg tgcagctgaa caagcccgtg aagatcgtgt gcgtgcgccc caacaacaac 900
acccgcaaga gcgtgcgcat cggccccggc cagaccttct acgccaccgg cgagatcatc 960
ggcgacatcc gccaggccta ctgcatcatc aacaagaccg agtggaacag caccctgcag 1020
ggcgtgagca agaagctgga ggagcacttc agcaagaagg ccatcaagtt cgagcccagc 1080
agcggcggcg acctggagat caccacccac agcttcaact gccgcggcga gttcttctac 1140
tgcgacacca gccagctgtt caacagcacc tacagcccca gcttcaacgg caccgagaac 1200
aagctgaacg gcaccatcac catcacctgc cgcatcaagc agatcatcaa catgtggcag 1260
aaggtgggcc gcgccatgta cgcccccccc atcgccggca acctgacctg cgagagcaac 1320
atcaccggcc tgctgctgac ccgcgacggc ggcaagaccg gccccaacga caccgagatc 1380
ttccgccccg gcggcggcga catgcgcgac aactggcgca acgagctgta caagtacaag 1440
gtggtggaga tcaagcccct gggcgtggcc cccaccgagg ccaagcgccg cgtggtggag 1500
cgcgagaagc gcgccgtggg catcggcgcc gtgttcctgg gcttcctggg cgccgccggc 1560
agcaccatgg gcgccgccag catcaccctg accgtgcagg cccgcctgct gctgagcggc 1620
atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1680
ctgaccgtgt ggggcatcaa gcagctgcag acccgcatcc tggccgtgga gcgctacctg 1740
aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1800
gtgccctgga acagcagctg gagcaaccgc agccacgacg agatctggga caacatgacc 1860
tggatgcagt gggaccgcga gatcaacaac tacaccgaca ccatctaccg cctgctggag 1920 
gagagccaga accagcagga gaagaacgag aaggacctgc tggccctgga cagctggcag 1980 
aacctgtgga actggttcag catcaccaac tggctgtggt acatcaagat cttcatcatg 2040 
atcgtgggcg gcctgatcgg cctgcgcatc atcttcgccg tgctgagcat cgtgaaccgc 2100 
gtgcgccagg gctacagccc cctgcccttc cagaccctga cccccaaccc ccgcgagccc 2160 
gaccgcctgg gccgcatcga ggaggagggc ggcgagcagg accgcggccg cagcatccgc 2220 
ctggtgagcg gcttcctggc cctggcctgg gacgacctgc gcagcctgtg cctgttcagc 2280 
taccaccgcc tgcgcgactt catcctgatc gccgcccgcg tgctggagct gctgggccag 2340 
cgcggctggg aggccctgaa gtacctgggc agcctggtgc agtactgggg cctggagctg 2400
aagaagagcg ccatcagcct gctggacacc atcgccatcg ccgtggccga gggcaccgac 2460 
cgcatcatcg agttcatcca gcgcatctgc cgcgccatcc gcaacatccc ccgccgcatc 2520 
cgccagggct tcgaggccgc cctgcag 2547 

<210> 10
<211> 1035 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic
gp41 coding region of HIV strain AF110968 

<400> 10 

gccgtgggca tcggcgccgt gttcctgggc ttcctgggcg ccgccggcag caccatgggc 60 
gccgccagca tcaccctgac cgtgcaggcc cgcctgctgc tgagcggcat cgtgcagcag 120 
cagaacaacc tgctgcgcgc catcgaggcc cagcagcacc tgctgcagct gaccgtgtgg 180 
ggcatcaagc agctgcagac ccgcatcctg gccgtggagc gctacctgaa ggaccagcag 240 
ctgctgggca tctggggctg cagcggcaag ctgatctgca ccaccgccgt gccctggaac 300
agcagctgga gcaaccgcag ccacgacgag atctgggaca acatgacctg gatgcagtgg 360 
gaccgcgaga tcaacaacta caccgacacc atctaccgcc tgctggagga gagccagaac 420
cagcaggaga agaacgagaa ggacctgctg gccctggaca gctggcagaa cctgtggaac 480 
tggttcagca tcaccaactg gctgtggtac atcaagatct tcatcatgat cgtgggcggc 540 
ctgatcggcc tgcgcatcat cttcgccgtg ctgagcatcg tgaaccgcgt gcgccagggc 600 
tacagccccc tgcccttcca gaccctgacc cccaaccccc gcgagcccga ccgcctgggc 660 
cgcatcgagg aggagggcgg cgagcaggac cgcggccgca gcatccgcct ggtgagcggc 720 
ttcctggccc tggcctggga cgacctgcgc agcctgtgcc tgttcagcta ccaccgcctg 780
cgcgacttca tcctgatcgc cgcccgcgtg ctggagctgc tgggccagcg cggctgggag 840
gccctgaagt acctgggcag cctggtgcag tactggggcc tggagctgaa gaagagcgcc 900 
atcagcctgc tggacaccat cgccatcgcc gtggccgagg gcaccgaccg catcatcgag 960 
ttcatccagc gcatctgccg cgccatccgc aacatccccc gccgcatccg ccagggcttc 1020 
gaggccgccc tgcag 1035 

<210> 11 
<211> 144 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic Env 
common region of HIV strain AF110975 

<400> 11 

agcatcatca ccctgccctg ccgcatcaag cagatcatcg acatgtggca gaaggtgggc 60 
cgcgccatct acgccccccc catcgagggc aacatcacct gcagcagcag catcaccggc 120 
ctgctgctgg cccgcgacgg cggc 144 



<210> 12 
<211> 1437
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
gp120 coding region of HIV strain AF110975

<400> 12 

agcggcctgg gcaacctgtg ggtgaccgtg tacgacggcg tgcccgtgtg gcgcgaggcc 60 
agcaccaccc tgttctgcgc cagcgacgcc aaggcctacg agaaggaggt gcacaacgtg 120 
tgggccaccc acgcctgcgt gcccaccgac cccaaccccc aggagatcga gctggacaac 180 
gtgaccgaga acttcaacat gtggaagaac gacatggtgg accagatgca cgaggacatc 240 
atcagcctgt gggaccagag cctgaagccc cgcgtgaagc tgacccccct gtgcgtgacc 300 
ctgaagtgca ccaactacag caccaactac agcaacacca tgaacgccac cagctacaac 360 
aacaacacca ccgaggagat caagaactgc accttcaaca tgaccaccga gctgcgcgac 420 
aagaagcagc aggtgtacgc cctgttctac aagctggaca tcgtgcccct gaacagcaac 480
agcagcgagt accgcctgat caactgcaac accagcgcca tcacccaggc ctgccccaag 540
gtgagcttcg accccatccc catccactac tgcgcccccg ccggctacgc catcctgaag 600 
tgcaagaaca acaccagcaa cggcaccggc ccctgccaga acgtgagcac cgtgcagtgc 660 
acccacggca tcaagcccgt ggtgagcacc cccctgctgc tgaacggcag cctggccgag 720 
ggcggcgaga tcatcatccg cagcaagaac ctgagcaaca acgcctacac catcatcgtg 780
cacctgaacg acagcgtgga gatcgtgtgc acccgcccca acaacaacac ccgcaagggc 840
atccgcatcg gccccggcca gaccttctac gccaccgaga acatcatcgg cgacatccgc 900 
caggcccact gcaacatcag cgccggcgag tggaacaagg ccgtgcagcg cgtgagcgcc 960 
aagctgcgcg agcacttccc caacaagacc atcgagttcc agcccagcag cggcggcgac 1020 
ctggagatca ccacccacag cttcaactgc cgcggcgagt tcttctactg caacaccagc 1080 
aagctgttca acagcagcta caacggcacc agctaccgcg gcaccgagag caacagcagc 1140 
atcatcaccc tgccctgccg catcaagcag atcatcgaca tgtggcagaa ggtgggccgc 1200 
gccatctacg ccccccccat cgagggcaac atcacctgca gcagcagcat caccggcctg 1260 
ctgctggccc gcgacggcgg cctggacaac atcaccaccg agatcttccg cccccagggc 1320 
ggcgacatga aggacaactg gcgcaacgag ctgtacaagt acaaggtggt ggagatcaag 1380 
cccctgggcg tggcccccac cgaggccaag cgccgcgtgg tggagcgcga gaagcgc 1437 

<210> 13 

<211> 1950 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
gp140 coding region of HIV strain AF110975

<400> 13 

agcggcctgg gcaacctgtg ggtgaccgtg tacgacggcg tgcccgtgtg gcgcgaggcc 60 

agcaccaccc tgttctgcgc cagcgacgcc aaggcctacg agaaggaggt gcacaacgtg 120 
tgggccaccc acgcctgcgt gcccaccgac cccaaccccc aggagatcga gctggacaac 180 

gtgaccgaga acttcaacat gtggaagaac gacatggtgg accagatgca cgaggacatc 240

atcagcctgt gggaccagag cctgaagccc cgcgtgaagc tgacccccct gtgcgtgacc 300

ctgaagtgca ccaactacag caccaactac agcaacacca tgaacgccac cagctacaac 360 

aacaacacca ccgaggagat caagaactgc accttcaaca tgaccaccga gctgcgcgac 420 

aagaagcagc aggtgtacgc cctgttctac aagctggaca tcgtgcccct gaacagcaac 480 

agcagcgagt accgcctgat caactgcaac accagcgcca tcacccaggc ctgccccaag 540 

gtgagcttcg accccatccc catccactac tgcgcccccg ccggctacgc catcctgaag 600 

tgcaagaaca acaccagcaa cggcaccggc ccctgccaga acgtgagcac cgtgcagtgc 660 

acccacggca tcaagcccgt ggtgagcacc cccctgctgc tgaacggcag cctggccgag 720

ggcggcgaga tcatcatccg cagcaagaac ctgagcaaca acgcctacac catcatcgtg 780

cacctgaacg acagcgtgga gatcgtgtgc acccgcccca acaacaacac ccgcaagggc 840

atccgcatcg gccccggcca gaccttctac gccaccgaga acatcatcgg cgacatccgc 900

caggcccact gcaacatcag cgccggcgag tggaacaagg ccgtgcagcg cgtgagcgcc 960

aagctgcgcg agcacttccc caacaagacc atcgagttcc agcccagcag cggcggcgac 1020 

ctggagatca ccacccacag cttcaactgc cgcggcgagt tcttctactg caacaccagc 1080 

aagctgttca acagcagcta caacggcacc agctaccgcg gcaccgagag caacagcagc 1140 

atcatcaccc tgccctgccg catcaagcag atcatcgaca tgtggcagaa ggtgggccgc 1200 

gccatctacg ccccccccat cgagggcaac atcacctgca gcagcagcat caccggcctg 1260 

ctgctggccc gcgacggcgg cctggacaac atcaccaccg agatcttccg cccccagggc 1320 

ggcgacatga aggacaactg gcgcaacgag ctgtacaagt acaaggtggt ggagatcaag 1380 

cccctgggcg tggcccccac cgaggccaag cgccgcgtgg tggagcgcga gaagcgcgcc 1440 

gtgggcatcg gcgccgtgat cttcggcttc ctgggcgccg ccggcagcaa catgggcgcc 1500 

gccagcatca ccctgaccgc ccaggcccgc cagctgctga gcggcatcgt gcagcagcag 1560 

agcaacctgc tgcgcgccat cgaggcccag cagcacatgc tgcagctgac cgtgtggggc 1620

atcaagcagc tgcaggcccg cgtgctggcc atcgagcgct acctgaagga ccagcagctg 1680

ctgggcatct ggggctgcag cggcaagctg atctgcacca ccaccgtgcc ctggaacagc 1740 

agctggagca acaagaccca gggcgagatc tgggagaaca tgacctggat gcagtgggac 1800 

aaggagatca gcaactacac cggcatcatc taccgcctgc tggaggagag ccagaaccag 1860 

caggagcaga acgagaagga cctgctggcc ctggacagcc gcaacaacct gtggagctgg 1920 

ttcaacatca gcaactggct gtggtacatc 1950 

<210> 14 
<211> 2493 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
gp160 coding region of HIV strain AF110975

<400> 14 

agcggcctgg gcaacctgtg ggtgaccgtg tacgacggcg tgcccgtgtg gcgcgaggcc 60 
agcaccaccc tgttctgcgc cagcgacgcc aaggcctacg agaaggaggt gcacaacgtg 120
tgggccaccc acgcctgcgt gcccaccgac cccaaccccc aggagatcga gctggacaac 180
gtgaccgaga acttcaacat gtggaagaac gacatggtgg accagatgca cgaggacatc 240
atcagcctgt gggaccagag cctgaagccc cgcgtgaagc tgacccccct gtgcgtgacc 300 
ctgaagtgca ccaactacag caccaactac agcaacacca tgaacgccac cagctacaac 360 
aacaacacca ccgaggagat caagaactgc accttcaaca tgaccaccga gctgcgcgac 420 
aagaagcagc aggtgtacgc cctgttctac aagctggaca tcgtgcccct gaacagcaac 480 
agcagcgagt accgcctgat caactgcaac accagcgcca tcacccaggc ctgccccaag 540
gtgagcttcg accccatccc catccactac tgcgcccccg ccggctacgc catcctgaag 600 
tgcaagaaca acaccagcaa cggcaccggc ccctgccaga acgtgagcac cgtgcagtgc 660 
acccacggca tcaagcccgt ggtgagcacc cccctgctgc tgaacggcag cctggccgag 720 
ggcggcgaga tcatcatccg cagcaagaac ctgagcaaca acgcctacac catcatcgtg 780 
cacctgaacg acagcgtgga gatcgtgtgc acccgcccca acaacaacac ccgcaagggc 840
atccgcatcg gccccggcca gaccttctac gccaccgaga acatcatcgg cgacatccgc 900
caggcccact gcaacatcag cgccggcgag tggaacaagg ccgtgcagcg cgtgagcgcc 960
aagctgcgcg agcacttccc caacaagacc atcgagttcc agcccagcag cggcggcgac 1020 
ctggagatca ccacccacag cttcaactgc cgcggcgagt tcttctactg caacaccagc 1080 
aagctgttca acagcagcta caacggcacc agctaccgcg gcaccgagag caacagcagc 1140
atcatcaccc tgccctgccg catcaagcag atcatcgaca tgtggcagaa ggtgggccgc 1200
gccatctacg ccccccccat cgagggcaac atcacctgca gcagcagcat caccggcctg 1260
ctgctggccc gcgacggcgg cctggacaac atcaccaccg agatcttccg cccccagggc 1320
ggcgacatga aggacaactg gcgcaacgag ctgtacaagt acaaggtggt ggagatcaag 1380 
cccctgggcg tggcccccac cgaggccaag cgccgcgtgg tggagcgcga gaagcgcgcc 1440 
gtgggcatcg gcgccgtgat cttcggcttc ctgggcgccg ccggcagcaa catgggcgcc 1500 
gccagcatca ccctgaccgc ccaggcccgc cagctgctga gcggcatcgt gcagcagcag 1560 
agcaacctgc tgcgcgccat cgaggcccag cagcacatgc tgcagctgac cgtgtggggc 1620
atcaagcagc tgcaggcccg cgtgctggcc atcgagcgct acctgaagga ccagcagctg 1680 
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccaccgtgcc ctggaacagc 1740 
agctggagca acaagaccca gggcgagatc tgggagaaca tgacctggat gcagtgggac 1800 
aaggagatca gcaactacac cggcatcatc taccgcctgc tggaggagag ccagaaccag 1860
caggagcaga acgagaagga cctgctggcc ctggacagcc gcaacaacct gtggagctgg 1920 
ttcaacatca gcaactggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 1980 
atcggcctgc gcatcatctt cgccgtgctg agcatcgtga accgcgtgcg ccagggctac 2040
agccccctga gcttccagac cctgaccccc aacccccgcg gcctggaccg cctgggccgc 2100 
atcgaggagg agggcggcga gcaggaccgc gaccgcagca tccgcctggt gcagggcttc 2160 
ctggccctgg cctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2220 
gacctgatcc tggtgaccgc ccgcgtggtg gagctgctgg gccgcagcag cccccgcggc 2280 
ctgcagcgcg gctgggaggc cctgaagtac ctgggcagcc tggtgcagta ctggggcctg 2340 
gagctgaaga agagcgccac cagcctgctg gacagcatcg ccatcgccgt ggccgagggc 2400 
accgaccgca tcatcgaggt gatccagcgc atctaccgcg ccttctgcaa catcccccgc 2460 
cgcgtgcgcc agggcttcga ggccgccctg cag 2493 

<210> 15 
<211> 2565 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
signal sequence and gp160 coding region of HIV
strain AF110975 



<400> 15 

atgcgcgtgc gcggcatcct gcgcagctgg cagcagtggt ggatctgggg catcctgggc 60 
ttctggatct gcagcggcct gggcaacctg tgggtgaccg tgtacgacgg cgtgcccgtg 120 
tggcgcgagg ccagcaccac cctgttctgc gccagcgacg ccaaggccta cgagaaggag 180 
gtgcacaacg tgtgggccac ccacgcctgc gtgcccaccg accccaaccc ccaggagatc 240 
gagctggaca acgtgaccga gaacttcaac atgtggaaga acgacatggt ggaccagatg 300 
cacgaggaca tcatcagcct gtgggaccag agcctgaagc cccgcgtgaa gctgaccccc 360 
ctgtgcgtga ccctgaagtg caccaactac agcaccaact acagcaacac catgaacgcc 420 
accagctaca acaacaacac caccgaggag atcaagaact gcaccttcaa catgaccacc 480 
gagctgcgcg acaagaagca gcaggtgtac gccctgttct acaagctgga catcgtgccc 540 
ctgaacagca acagcagcga gtaccgcctg atcaactgca acaccagcgc catcacccag 600 
gcctgcccca aggtgagctt cgaccccatc cccatccact actgcgcccc cgccggctac 660 
gccatcctga agtgcaagaa caacaccagc aacggcaccg gcccctgcca gaacgtgagc 720 
accgtgcagt gcacccacgg catcaagccc gtggtgagca cccccctgct gctgaacggc 780 
agcctggccg agggcggcga gatcatcatc cgcagcaaga acctgagcaa caacgcctac 840 
accatcatcg tgcacctgaa cgacagcgtg gagatcgtgt gcacccgccc caacaacaac 900 
acccgcaagg gcatccgcat cggccccggc cagaccttct acgccaccga gaacatcatc 960 
ggcgacatcc gccaggccca ctgcaacatc agcgccggcg agtggaacaa ggccgtgcag 1020 
cgcgtgagcg ccaagctgcg cgagcacttc cccaacaaga ccatcgagtt ccagcccagc 1080 
agcggcggcg acctggagat caccacccac agcttcaact gccgcggcga gttcttctac 1140 
tgcaacacca gcaagctgtt caacagcagc tacaacggca ccagctaccg cggcaccgag 1200 
agcaacagca gcatcatcac cctgccctgc cgcatcaagc agatcatcga catgtggcag 1260 
aaggtgggcc gcgccatcta cgcccccccc atcgagggca acatcacctg cagcagcagc 1320
atcaccggcc tgctgctggc ccgcgacggc ggcctggaca acatcaccac cgagatcttc 1380 
cgcccccagg gcggcgacat gaaggacaac tggcgcaacg agctgtacaa gtacaaggtg 1440 
gtggagatca agcccctggg cgtggccccc accgaggcca agcgccgcgt ggtggagcgc 1500 
gagaagcgcg ccgtgggcat cggcgccgtg atcttcggct tcctgggcgc cgccggcagc 1560 
aacatgggcg ccgccagcat caccctgacc gcccaggccc gccagctgct gagcggcatc 1620 
gtgcagcagc agagcaacct gctgcgcgcc atcgaggccc agcagcacat gctgcagctg 1680 
accgtgtggg gcatcaagca gctgcaggcc cgcgtgctgg ccatcgagcg ctacctgaag 1740
gaccagcagc tgctgggcat ctggggctgc agcggcaagc tgatctgcac caccaccgtg 1800
ccctggaaca gcagctggag caacaagacc cagggcgaga tctgggagaa catgacctgg 1860
atgcagtggg acaaggagat cagcaactac accggcatca tctaccgcct gctggaggag 1920
agccagaacc agcaggagca gaacgagaag gacctgctgg ccctggacag ccgcaacaac 1980
ctgtggagct ggttcaacat cagcaactgg ctgtggtaca tcaagatctt catcatgatc 2040
gtgggcggcc tgatcggcct gcgcatcatc ttcgccgtgc tgagcatcgt gaaccgcgtg 2100
cgccagggct acagccccct gagcttccag accctgaccc ccaacccccg cggcctggac 2160
cgcctgggcc gcatcgagga ggagggcggc gagcaggacc gcgaccgcag catccgcctg 2220 
gtgcagggct tcctggccct ggcctgggac gacctgcgca gcctgtgcct gttcagctac 2280 
caccgcctgc gcgacctgat cctggtgacc gcccgcgtgg tggagctgct gggccgcagc 2340 
agcccccgcg gcctgcagcg cggctgggag gccctgaagt acctgggcag cctggtgcag 2400 
tactggggcc tggagctgaa gaagagcgcc accagcctgc tggacagcat cgccatcgcc 2460 
gtggccgagg gcaccgaccg catcatcgag gtgatccagc gcatctaccg cgccttctgc 2520 
aacatccccc gccgcgtgcg ccagggcttc gaggccgccc tgcag 2565 

<210> 16 
<211> 1056 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: synthetic
gp41 coding region of HIV strain AF110975



<400> 16 

gccgtgggca tcggcgccgt gatcttcggc ttcctgggcg ccgccggcag caacatgggc 60
gccgccagca tcaccctgac cgcccaggcc cgccagctgc tgagcggcat cgtgcagcag 120
cagagcaacc tgctgcgcgc catcgaggcc cagcagcaca tgctgcagct gaccgtgtgg 180
ggcatcaagc agctgcaggc ccgcgtgctg gccatcgagc gctacctgaa ggaccagcag 240
ctgctgggca tctggggctg cagcggcaag ctgatctgca ccaccaccgt gccctggaac 300
agcagctgga gcaacaagac ccagggcgag atctgggaga acatgacctg gatgcagtgg 360
gacaaggaga tcagcaacta caccggcatc atctaccgcc tgctggagga gagccagaac 420
cagcaggagc agaacgagaa ggacctgctg gccctggaca gccgcaacaa cctgtggagc 480
tggttcaaca tcagcaactg gctgtggtac atcaagatct tcatcatgat cgtgggcggc 540
ctgatcggcc tgcgcatcat cttcgccgtg ctgagcatcg tgaaccgcgt gcgccagggc 600
tacagccccc tgagcttcca gaccctgacc cccaaccccc gcggcctgga ccgcctgggc 660
cgcatcgagg aggagggcgg cgagcaggac cgcgaccgca gcatccgcct ggtgcagggc 720
ttcctggccc tggcctggga cgacctgcgc agcctgtgcc tgttcagcta ccaccgcctg 780
cgcgacctga tcctggtgac cgcccgcgtg gtggagctgc tgggccgcag cagcccccgc 840
ggcctgcagc gcggctggga ggccctgaag tacctgggca gcctggtgca gtactggggc 900
ctggagctga agaagagcgc caccagcctg ctggacagca tcgccatcgc cgtggccgag 960
ggcaccgacc gcatcatcga ggtgatccag cgcatctacc gcgccttctg caacatcccc 1020
cgccgcgtgc gccagggctt cgaggccgcc ctgcag 1056

<210> 17
<211> 492
<212> PRT

<213> Human immunodeficiency virus
<400> 17

Met Gly Ala Arg Ala Ser Ile Leu Arg Gly Gly Lys Leu Asp Ala Trp
1 5 10 15

Glu Arg Ile Arg Leu Arg Pro Gly Gly Lys Lys Cys Tyr Met Met Lys
20 25 30

His Leu Val Trp Ala Ser Arg Glu Leu Glu Lys Phe Ala Leu Asn Pro
35 40 45

Gly Leu Leu Glu Thr Ser Glu Gly Cys Lys Gln Ile Ile Arg Gln Leu
50 55 60

His Pro Ala Leu Gln Thr Gly Ser Glu Glu Leu Lys Ser Leu Phe Asn
65 70 75 80

Thr Val Ala Thr Leu Tyr Cys Val His Glu Lys Ile Glu Val Arg Asp
85 90 95

Thr Lys Glu Ala Leu Asp Lys Ile Glu Glu Glu Gln Asn Lys Cys Gln
100 105 110

Gln Lys Ile Gln Gln Ala Glu Ala Ala Asp Lys Gly Lys Val Ser Gln
115 120 125

Asn Tyr Pro Ile Val Gln Asn Leu Gln Gly Gln Met Val His Gln Ala
130 135 140

Ile Ser Pro Arg Thr Leu Asn Ala Trp Val Lys Val Ile Glu Glu Lys
145 150 155 160

Ala Phe Ser Pro Glu Val Ile Pro Met Phe Thr Ala Leu Ser Glu Gly
165 170 175

Ala Thr Pro Gln Asp Leu Asn Thr Met Leu Asn Thr Val Gly Gly His
180 185 190

Gln Ala Ala Met Gln Met Leu Lys Asp Thr Ile Asn Glu Glu Ala Ala
195 200 205

Glu Trp Asp Arg Val His Pro Val His Ala Gly Pro Ile Ala Pro Gly
210 215 220

Gln Met Arg Glu Pro Arg Gly Ser Asp Ile Ala Gly Thr Thr Ser Thr
225 230 235 240

Leu Gln Glu Gln Ile Ala Trp Met Thr Ser Asn Pro Pro Ile Pro Val
245 250 255

Gly Asp Ile Tyr Lys Arg Trp Ile Ile Leu Gly Leu Asn Lys Ile Val
260 265 270

Arg Met Tyr Ser Pro Val Ser Ile Leu Asp Ile Lys Gln Gly Pro Lys
275 280 285

Glu Pro Phe Arg Asp Tyr Val Asp Arg Phe Phe Lys Thr Leu Arg Ala
290 295 300

Glu Gln Ser Thr Gln Glu Val Lys Asn Trp Met Thr Asp Thr Leu Leu
305 310 315 320

Val Gln Asn Ala Asn Pro Asp Cys Lys Thr Ile Leu Arg Ala Leu Gly
325 330 335

Pro Gly Ala Ser Leu Glu Glu Met Met Thr Ala Cys Gln Gly Val Gly
340 345 350

Gly Pro Ser His Lys Ala Arg Val Leu Ala Glu Ala Met Ser Gln Ala
355 360 365

Asn Thr Ser Val Met Met Gln Lys Ser Asn Phe Lys Gly Pro Arg Arg
370 375 380

Ile Val Lys Cys Phe Asn Cys Gly Lys Glu Gly His Ile Ala Arg Asn
385 390 395 400

Cys Arg Ala Pro Arg Lys Lys Gly Cys Trp Lys Cys Gly Lys Glu Gly
405 410 415

His Gln Met Lys Asp Cys Thr Glu Arg Gln Ala Asn Phe Leu Gly Lys
420 425 430

Ile Trp Pro Ser His Lys Gly Arg Pro Gly Asn Phe Leu Gln Ser Arg
435 440 445

Pro Glu Pro Thr Ala Pro Pro Ala Glu Ser Phe Arg Phe Glu Glu Thr
450 455 460

Thr Pro Gly Gln Lys Gln Glu Ser Lys Asp Arg Glu Thr Leu Thr Ser
465 470 475 480

Leu Lys Ser Leu Phe Gly Asn Asp Pro Leu Ser Gln
485 490



<210> 18 
<211> 81 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: synthetic 
signal sequence of HIV strain AF110968 

<400> 18 

atgcgcgtga tgggcatcct gaagaactac cagcagtggt ggatgtgggg catcctgggc 60 
ttctggatgc tgatcatcag c 81 

<210> 19 
<211> 72 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
signal sequence of HIV strain AF110975 

<400> 19 

atgcgcgtgc gcggcatcct gcgcagctgg cagcagtggt ggatctgggg catcctgggc 60 
ttctggatct gc 72 

<210> 20 
<211> 1479 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: synthetic Gag 
coding sequence of HIV strain AF110965 

<400> 20 

atgggcgccc gcgccagcat cctgcgcggc ggcaagctgg acgcctggga gcgcatccgc 60 
ctgcgccccg gcggcaagaa gtgctacatg atgaagcacc tggtgtgggc cagccgcgag 120
ctggagaagt tcgccctgaa ccccggcctg ctggagacca gcgagggctg caagcagatc 180
atccgccagc tgcaccccgc cctgcagacc ggcagcgagg agctgaagag cctgttcaac 240
accgtggcca ccctgtactg cgtgcacgag aagatcgagg tgcgcgacac caaggaggcc 300
ctggacaaga tcgaggagga gcagaacaag tgccagcaga agatccagca ggccgaggcc 360 
gccgacaagg gcaaggtgag ccagaactac cccatcgtgc agaacctgca gggccagatg 420 
gtgcaccagg ccatcagccc ccgcaccctg aacgcctggg tgaaggtgat cgaggagaag 480
gccttcagcc ccgaggtgat ccccatgttc accgccctga gcgagggcgc caccccccag 540 
gacctgaaca ccatgctgaa caccgtgggc ggccaccagg ccgccatgca gatgctgaag 600 
gacaccatca acgaggaggc cgccgagtgg gaccgcgtgc accccgtgca cgccggcccc 660 
atcgcccccg gccagatgcg cgagccccgc ggcagcgaca tcgccggcac caccagcacc 720 
ctgcaggagc agatcgcctg gatgaccagc aaccccccca tccccgtggg cgacatctac 780 
aagcgctgga tcatcctggg cctgaacaag atcgtgcgca tgtacagccc cgtgagcatc 840 
ctggacatca agcagggccc caaggagccc ttccgcgact acgtggaccg cttcttcaag 900 
accctgcgcg ccgagcagag cacccaggag gtgaagaact ggatgaccga caccctgctg 960 
gtgcagaacg ccaaccccga ctgcaagacc atcctgcgcg ccctgggccc cggcgccagc 1020 
ctggaggaga tgatgaccgc ctgccagggc gtgggcggcc ccagccacaa ggcccgcgtg 1080 
ctggccgagg ccatgagcca ggccaacacc agcgtgatga tgcagaagag caacttcaag 1140 
ggcccccgcc gcatcgtgaa gtgcttcaac tgcggcaagg agggccacat cgcccgcaac 1200 
tgccgcgccc cccgcaagaa gggctgctgg aagtgcggca aggagggcca ccagatgaag 1260 
gactgcaccg agcgccaggc caacttcctg ggcaagatct ggcccagcca caagggccgc 1320 
cccggcaact tcctgcagag ccgccccgag cccaccgccc cccccgccga gagcttccgc 1380
ttcgaggaga ccacccccgg ccagaagcag gagagcaagg accgcgagac cctgaccagc 1440 
ctgaagagcc tgttcggcaa cgaccccctg agccagtaa 1479 

<210> 21 
<211> 1509 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic Gag 
coding sequence of HIV strain AF110967



<400> 21 

atgggcgccc gcgccagcat cctgcgcggc gagaagctgg acaagtggga gaagatccgc 60
ctgcgccccg gcggcaagaa gcactacatg ctgaagcacc tggtgtgggc cagccgcgag 120
ctggagggct tcgccctgaa ccccggcctg ctggagaccg ccgagggctg caagcagatc 180
atgaagcagc tgcagcccgc cctgcagacc ggcaccgagg agctgcgcag cctgtacaac 240
accgtggcca ccctgtactg cgtgcacgcc ggcatcgagg tgcgcgacac caaggaggcc 300
ctggacaaga tcgaggagga gcagaacaag agccagcaga agacccagca ggccaaggag 360
gccgacggca aggtgagcca gaactacccc atcgtgcaga acctgcaggg ccagatggtg 420
caccaggcca tcagcccccg caccctgaac gcctgggtga aggtgatcga ggagaaggcc 480
ttcagccccg aggtgatccc catgttcacc gccctgagcg agggcgccac cccccaggac 540
ctgaacacca tgctgaacac cgtgggcggc caccaggccg ccatgcagat gctgaaggac 600
accatcaacg aggaggccgc cgagtgggac cgcctgcacc ccgtgcaggc cggccccgtg 660
gcccccggcc agatgcgcga cccccgcggc agcgacatcg ccggcgccac cagcaccctg 720
caggagcaga tcgcctggat gaccagcaac ccccccgtgc ccgtgggcga catctacaag 780
cgctggatca tcctgggcct gaacaagatc gtgcgcatgt acagccccgt gagcatcctg 840
gacatccgcc agggccccaa ggagcccttc cgcgactacg tggaccgctt cttcaagacc 900
ctgcgcgccg agcaggccac ccaggacgtg aagaactgga tgaccgagac cctgctggtg 960
cagaacgcca accccgactg caagaccatc ctgcgcgccc tgggccccgg cgccaccctg 1020
gaggagatga tgaccgcctg ccagggcgtg ggcggccccg gccacaaggc ccgcgtgctg 1080
gccgaggcca tgagccaggc caacagcgtg aacatcatga tgcagaagag caacttcaag 1140
ggcccccgcc gcaacgtgaa gtgcttcaac tgcggcaagg agggccacat cgccaagaac 1200
tgccgcgccc cccgcaagaa gggctgctgg aagtgcggca aggagggcca ccagatgaag 1260
gactgcaccg agcgccaggc caacttcctg ggcaagatct ggcccagcca caagggccgc 1320
cccggcaact tcctgcagaa ccgcagcgag cccgccgccc ccaccgtgcc caccgccccc 1380
cccgccgaga gcttccgctt cgaggagacc acccccgccc ccaagcagga gcccaaggac 1440
cgcgagccct accgcgagcc cctgaccgcc ctgcgcagcc tgttcggcag cggccccctg 1500
agccagtaa 1509

<210> 22
<211> 502
<212> PRT

<213> Human immunodeficiency virus
<400> 22 

Met Gly Ala Arg Ala Ser Ile Leu Arg Gly Glu Lys Leu Asp Lys Trp
1 5 10 15

Glu Lys Ile Arg Leu Arg Pro Gly Gly Lys Lys His Tyr Met Leu Lys
20 25 30

His Leu Val Trp Ala Ser Arg Glu Leu Glu Gly Phe Ala Leu Asn Pro
35 40 45

Gly Leu Leu Glu Thr Ala Glu Gly Cys Lys Gln Ile Met Lys Gln Leu
50 55 60

Gln Pro Ala Leu Gln Thr Gly Thr Glu Glu Leu Arg Ser Leu Tyr Asn
65 70 75 80

Thr Val Ala Thr Leu Tyr Cys Val His Ala Gly Ile Glu Val Arg Asp
85 90 95

Thr Lys Glu Ala Leu Asp Lys Ile Glu Glu Glu Gln Asn Lys Ser Gln
100 105 110

Gln Lys Thr Gln Gln Ala Lys Glu Ala Asp Gly Lys Val Ser Gln Asn
115 120 125

Tyr Pro Ile Val Gln Asn Leu Gln Gly Gln Met Val His Gln Ala Ile
130 135 140

Ser Pro Arg Thr Leu Asn Ala Trp Val Lys Val Ile Glu Glu Lys Ala
145 150 155 160

Phe Ser Pro Glu Val Ile Pro Met Phe Thr Ala Leu Ser Glu Gly Ala
165 170 175

Thr Pro Gln Asp Leu Asn Thr Met Leu Asn Thr Val Gly Gly His Gln
180 185 190

Ala Ala Met Gln Met Leu Lys Asp Thr Ile Asn Glu Glu Ala Ala Glu
195 200 205

Trp Asp Arg Leu His Pro Val Gln Ala Gly Pro Val Ala Pro Gly Gln
210 215 220

Met Arg Asp Pro Arg Gly Ser Asp Ile Ala Gly Ala Thr Ser Thr Leu
225 230 235 240

Gln Glu Gln Ile Ala Trp Met Thr Ser Asn Pro Pro Val Pro Val Gly
245 250 255

Asp Ile Tyr Lys Arg Trp Ile Ile Leu Gly Leu Asn Lys Ile Val Arg
260 265 270

Met Tyr Ser Pro Val Ser Ile Leu Asp Ile Arg Gln Gly Pro Lys Glu
275 280 285

Pro Phe Arg Asp Tyr Val Asp Arg Phe Phe Lys Thr Leu Arg Ala Glu
290 295 300

Gln Ala Thr Gln Asp Val Lys Asn Trp Met Thr Glu Thr Leu Leu Val
305 310 315 320

Gln Asn Ala Asn Pro Asp Cys Lys Thr Ile Leu Arg Ala Leu Gly Pro
325 330 335

Gly Ala Thr Leu Glu Glu Met Met Thr Ala Cys Gln Gly Val Gly Gly
340 345 350

Pro Gly His Lys Ala Arg Val Leu Ala Glu Ala Met Ser Gln Ala Asn
355 360 365

Ser Val Asn Ile Met Met Gln Lys Ser Asn Phe Lys Gly Pro Arg Arg
370 375 380

Asn Val Lys Cys Phe Asn Cys Gly Lys Glu Gly His Ile Ala Lys Asn
385 390 395 400

Cys Arg Ala Pro Arg Lys Lys Gly Cys Trp Lys Cys Gly Lys Glu Gly
405 410 415

His Gln Met Lys Asp Cys Thr Glu Arg Gln Ala Asn Phe Leu Gly Lys
420 425 430

Ile Trp Pro Ser His Lys Gly Arg Pro Gly Asn Phe Leu Gln Asn Arg
435 440 445

Ser Glu Pro Ala Ala Pro Thr Val Pro Thr Ala Pro Pro Ala Glu Ser
450 455 460

Phe Arg Phe Glu Glu Thr Thr Pro Ala Pro Lys Gln Glu Pro Lys Asp
465 470 475 480

Arg Glu Pro Tyr Arg Glu Pro Leu Thr Ala Leu Arg Ser Leu Phe Gly
485 490 495

Ser Gly Pro Leu Ser Gln
500



<210> 23 
<211> 849 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 23 

Met Arg Val Met Gly Ile Leu Lys Asn Tyr Gln Gln Trp Trp Met Trp
1 5 10 15

Gly Ile Leu Gly Phe Trp Met Leu Ile Ile Ser Ser Val Val Gly Asn
20 25 30

Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Lys
35 40 45

Thr Thr Leu Phe Cys Thr Ser Asp Ala Lys Ala Tyr Glu Thr Glu Val
50 55 60

His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro
65 70 75 80

Gln Glu Ile Val Leu Glu Asn Val Thr Glu Asn Phe Asn Met Trp Lys
85 90 95

Asn Asp Met Val Asp Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp
100 105 110

Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu
115 120 125

Lys Cys Arg Asn Val Asn Ala Thr Asn Asn Ile Asn Ser Met Ile Asp
130 135 140

Asn Ser Asn Lys Gly Glu Met Lys Asn Cys Ser Phe Asn Val Thr Thr
145 150 155 160

Glu Leu Arg Asp Arg Lys Gln Glu Val His Ala Leu Phe Tyr Arg Leu
165 170 175

Asp Val Val Pro Leu Gln Gly Asn Asn Ser Asn Glu Tyr Arg Leu Ile
180 185 190

Asn Cys Asn Thr Ser Ala Ile Thr Gln Ala Cys Pro Lys Val Ser Phe
195 200 205

Asp Pro Ile Pro Ile His Tyr Cys Thr Pro Ala Gly Tyr Ala Ile Leu
210 215 220

Lys Cys Asn Asn Gln Thr Phe Asn Gly Thr Gly Pro Cys Asn Asn Val
225 230 235 240

Ser Ser Val Gln Cys Ala His Gly Ile Lys Pro Val Val Ser Thr Gln
245 250 255

Leu Leu Leu Asn Gly Ser Leu Ala Lys Gly Glu Ile Ile Ile Arg Ser
260 265 270

Glu Asn Leu Ala Asn Asn Ala Lys Ile Ile Ile Val Gln Leu Asn Lys
275 280 285

Pro Val Lys Ile Val Cys Val Arg Pro Asn Asn Asn Thr Arg Lys Ser
290 295 300

Val Arg Ile Gly Pro Gly Gln Thr Phe Tyr Ala Thr Gly Glu Ile Ile
305 310 315 320

Gly Asp Ile Arg Gln Ala Tyr Cys Ile Ile Asn Lys Thr Glu Trp Asn
325 330 335

Ser Thr Leu Gln Gly Val Ser Lys Lys Leu Glu Glu His Phe Ser Lys
340 345 350

Lys Ala Ile Lys Phe Glu Pro Ser Ser Gly Gly Asp Leu Glu Ile Thr
355 360 365

Thr His Ser Phe Asn Cys Arg Gly Glu Phe Phe Tyr Cys Asp Thr Ser
370 375 380

Gln Leu Phe Asn Ser Thr Tyr Ser Pro Ser Phe Asn Gly Thr Glu Asn
385 390 395 400

Lys Leu Asn Gly Thr Ile Thr Ile Thr Cys Arg Ile Lys Gln Ile Ile
405 410 415

Asn Met Trp Gln Lys Val Gly Arg Ala Met Tyr Ala Pro Pro Ile Ala
420 425 430

Gly Asn Leu Thr Cys Glu Ser Asn Ile Thr Gly Leu Leu Leu Thr Arg
435 440 445

Asp Gly Gly Lys Thr Gly Pro Asn Asp Thr Glu Ile Phe Arg Pro Gly
450 455 460

Gly Gly Asp Met Arg Asp Asn Trp Arg Asn Glu Leu Tyr Lys Tyr Lys
465 470 475 480

Val Val Glu Ile Lys Pro Leu Gly Val Ala Pro Thr Glu Ala Lys Arg
485 490 495

Arg Val Val Glu Arg Glu Lys Arg Ala Val Gly Ile Gly Ala Val Phe
500 505 510

Leu Gly Phe Leu Gly Ala Ala Gly Ser Thr Met Gly Ala Ala Ser Ile
515 520 525

Thr Leu Thr Val Gln Ala Arg Leu Leu Leu Ser Gly Ile Val Gln Gln
530 535 540

Gln Asn Asn Leu Leu Arg Ala Ile Glu Ala Gln Gln His Leu Leu Gln
545 550 555 560

Leu Thr Val Trp Gly Ile Lys Gln Leu Gln Thr Arg Ile Leu Ala Val
565 570 575

Glu Arg Tyr Leu Lys Asp Gln Gln Leu Leu Gly Ile Trp Gly Cys Ser
580 585 590

Gly Lys Leu Ile Cys Thr Thr Ala Val Pro Trp Asn Ser Ser Trp Ser
595 600 605

Asn Arg Ser His Asp Glu Ile Trp Asp Asn Met Thr Trp Met Gln Trp
610 615 620

Asp Arg Glu Ile Asn Asn Tyr Thr Asp Thr Ile Tyr Arg Leu Leu Glu
625 630 635 640

Glu Ser Gln Asn Gln Gln Glu Lys Asn Glu Lys Asp Leu Leu Ala Leu
645 650 655

Asp Ser Trp Gln Asn Leu Trp Asn Trp Phe Ser Ile Thr Asn Trp Leu
660 665 670

Trp Tyr Ile Lys Ile Phe Ile Met Ile Val Gly Gly Leu Ile Gly Leu
675 680 685

Arg Ile Ile Phe Ala Val Leu Ser Ile Val Asn Arg Val Arg Gln Gly
690 695 700

Tyr Ser Pro Leu Pro Phe Gln Thr Leu Thr Pro Asn Pro Arg Glu Pro
705 710 715 720

Asp Arg Leu Gly Arg Ile Glu Glu Glu Gly Gly Glu Gln Asp Arg Gly
725 730 735

Arg Ser Ile Arg Leu Val Ser Gly Phe Leu Ala Leu Ala Trp Asp Asp
740 745 750

Leu Arg Ser Leu Cys Leu Phe Ser Tyr His Arg Leu Arg Asp Phe Ile
755 760 765

Leu Ile Ala Ala Arg Val Leu Glu Leu Leu Gly Gln Arg Gly Trp Glu
770 775 780

Ala Leu Lys Tyr Leu Gly Ser Leu Val Gln Tyr Trp Gly Leu Glu Leu
785 790 795 800

Lys Lys Ser Ala Ile Ser Leu Leu Asp Thr Ile Ala Ile Ala Val Ala
805 810 815

Glu Gly Thr Asp Arg Ile Ile Glu Phe Ile Gln Arg Ile Cys Arg Ala
820 825 830

Ile Arg Asn Ile Pro Arg Arg Ile Arg Gln Gly Phe Glu Ala Ala Leu
835 840 845

Gln



<210> 24 
<211> 855 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 24 

Met Arg Val Arg Gly Ile Leu Arg Ser Trp Gln Gln Trp Trp Ile Trp
1 5 10 15

Gly Ile Leu Gly Phe Trp Ile Cys Ser Gly Leu Gly Asn Leu Trp Val
20 25 30

Thr Val Tyr Asp Gly Val Pro Val Trp Arg Glu Ala Ser Thr Thr Leu
35 40 45

Phe Cys Ala Ser Asp Ala Lys Ala Tyr Glu Lys Glu Val His Asn Val
50 55 60

Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro Gln Glu Ile
65 70 75 80

Glu Leu Asp Asn Val Thr Glu Asn Phe Asn Met Trp Lys Asn Asp Met
85 90 95

Val Asp Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp Gln Ser Leu
100 105 110

Lys Pro Arg Val Lys Leu Thr Pro Leu Cys Val Thr Leu Lys Cys Thr
115 120 125

Asn Tyr Ser Thr Asn Tyr Ser Asn Thr Met Asn Ala Thr Ser Tyr Asn
130 135 140

Asn Asn Thr Thr Glu Glu Ile Lys Asn Cys Thr Phe Asn Met Thr Thr
145 150 155 160

Glu Leu Arg Asp Lys Lys Gln Gln Val Tyr Ala Leu Phe Tyr Lys Leu
165 170 175

Asp Ile Val Pro Leu Asn Ser Asn Ser Ser Glu Tyr Arg Leu Ile Asn
180 185 190

Cys Asn Thr Ser Ala Ile Thr Gln Ala Cys Pro Lys Val Ser Phe Asp
195 200 205

Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly Tyr Ala Ile Leu Lys
210 215 220

Cys Lys Asn Asn Thr Ser Asn Gly Thr Gly Pro Cys Gln Asn Val Ser
225 230 235 240

Thr Val Gln Cys Thr His Gly Ile Lys Pro Val Val Ser Thr Pro Leu
245 250 255

Leu Leu Asn Gly Ser Leu Ala Glu Gly Gly Glu Ile Ile Ile Arg Ser
260 265 270

Lys Asn Leu Ser Asn Asn Ala Tyr Thr Ile Ile Val His Leu Asn Asp
275 280 285

Ser Val Glu Ile Val Cys Thr Arg Pro Asn Asn Asn Thr Arg Lys Gly
290 295 300

Ile Arg Ile Gly Pro Gly Gln Thr Phe Tyr Ala Thr Glu Asn Ile Ile
305 310 315 320

Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Ala Gly Glu Trp Asn
325 330 335

Lys Ala Val Gln Arg Val Ser Ala Lys Leu Arg Glu His Phe Pro Asn
340 345 350

Lys Thr Ile Glu Phe Gln Pro Ser Ser Gly Gly Asp Leu Glu Ile Thr
355 360 365

Thr His Ser Phe Asn Cys Arg Gly Glu Phe Phe Tyr Cys Asn Thr Ser
370 375 380

Lys Leu Phe Asn Ser Ser Tyr Asn Gly Thr Ser Tyr Arg Gly Thr Glu
385 390 395 400

Ser Asn Ser Ser Ile Ile Thr Leu Pro Cys Arg Ile Lys Gln Ile Ile
405 410 415

Asp Met Trp Gln Lys Val Gly Arg Ala Ile Tyr Ala Pro Pro Ile Glu
420 425 430

Gly Asn Ile Thr Cys Ser Ser Ser Ile Thr Gly Leu Leu Leu Ala Arg
435 440 445

Asp Gly Gly Leu Asp Asn Ile Thr Thr Glu Ile Phe Arg Pro Gln Gly
450 455 460

Gly Asp Met Lys Asp Asn Trp Arg Asn Glu Leu Tyr Lys Tyr Lys Val
465 470 475 480

Val Glu Ile Lys Pro Leu Gly Val Ala Pro Thr Glu Ala Lys Arg Arg
485 490 495

Val Val Glu Arg Glu Lys Arg Ala Val Gly Ile Gly Ala Val Ile Phe
500 505 510

Gly Phe Leu Gly Ala Ala Gly Ser Asn Met Gly Ala Ala Ser Ile Thr
515 520 525

Leu Thr Ala Gln Ala Arg Gln Leu Leu Ser Gly Ile Val Gln Gln Gln
530 535 540

Ser Asn Leu Leu Arg Ala Ile Glu Ala Gln Gln His Met Leu Gln Leu
545 550 555 560

Thr Val Trp Gly Ile Lys Gln Leu Gln Ala Arg Val Leu Ala Ile Glu
565 570 575

Arg Tyr Leu Lys Asp Gln Gln Leu Leu Gly Ile Trp Gly Cys Ser Gly
580 585 590

Lys Leu Ile Cys Thr Thr Thr Val Pro Trp Asn Ser Ser Trp Ser Asn
595 600 605

Lys Thr Gln Gly Glu Ile Trp Glu Asn Met Thr Trp Met Gln Trp Asp
610 615 620

Lys Glu Ile Ser Asn Tyr Thr Gly Ile Ile Tyr Arg Leu Leu Glu Glu
625 630 635 640

Ser Gln Asn Gln Gln Glu Gln Asn Glu Lys Asp Leu Leu Ala Leu Asp
645 650 655

Ser Arg Asn Asn Leu Trp Ser Trp Phe Asn Ile Ser Asn Trp Leu Trp
660 665 670

Tyr Ile Lys Ile Phe Ile Met Ile Val Gly Gly Leu Ile Gly Leu Arg
675 680 685

Ile Ile Phe Ala Val Leu Ser Ile Val Asn Arg Val Arg Gln Gly Tyr
690 695 700

Ser Pro Leu Ser Phe Gln Thr Leu Thr Pro Asn Pro Arg Gly Leu Asp
705 710 715 720

Arg Leu Gly Arg Ile Glu Glu Glu Gly Gly Glu Gln Asp Arg Asp Arg
725 730 735

Ser Ile Arg Leu Val Gln Gly Phe Leu Ala Leu Ala Trp Asp Asp Leu
740 745 750

Arg Ser Leu Cys Leu Phe Ser Tyr His Arg Leu Arg Asp Leu Ile Leu
755 760 765

Val Thr Ala Arg Val Val Glu Leu Leu Gly Arg Ser Ser Pro Arg Gly
770 775 780

Leu Gln Arg Gly Trp Glu Ala Leu Lys Tyr Leu Gly Ser Leu Val Gln
785 790 795 800

Tyr Trp Gly Leu Glu Leu Lys Lys Ser Ala Thr Ser Leu Leu Asp Ser
805 810 815

Ile Ala Ile Ala Val Ala Glu Gly Thr Asp Arg Ile Ile Glu Val Ile
820 825 830

Gln Arg Ile Tyr Arg Ala Phe Cys Asn Ile Pro Arg Arg Val Arg Gln
835 840 845

Gly Phe Glu Ala Ala Leu Gln
850 855



<210> 25 
<211> 20 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 25 

Asp Ile Lys Gln Gly Pro Lys Glu Pro Phe Arg Asp Tyr Val Asp Arg
1 5 10 15

Phe Phe Lys Thr 
20 



<210> 26 
<211> 60 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 26 

gacataaaac aaggaccaaa agagcccttt agagactatg tagaccggtt ctttaaaacc 60 

<210> 27 
<211> 20 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 27 

Asp Ile Arg Gln Gly Pro Lys Glu Pro Phe Arg Asp Tyr Val Asp Arg
1 5 10 15

Phe Phe Lys Thr
20

<210> 28
<211> 47
<212> PRT 

<213> Human immunodeficiency virus 
<400> 28 

Thr Ile Thr Ile Thr Cys Arg Ile Lys Gln Ile Ile Asn Met Trp Gln
1 5 10 15

Lys Val Gly Arg Ala Met Tyr Ala Pro Pro Ile Ala Gly Asn Leu Thr
20 25 30

Cys Glu Ser Asn Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly
35 40 45



<210> 29 
<211> 48 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 29 

Ser Ile Ile Thr Leu Pro Cys Arg Ile Lys Gln Ile Ile Asp Met Trp
1 5 10 15

Gln Lys Val Gly Arg Ala Ile Tyr Ala Pro Pro Ile Glu Gly Asn Ile
20 25 30

Thr Cys Ser Ser Ser Ile Thr Gly Leu Leu Leu Ala Arg Asp Gly Gly
35 40 45